Re: [basex-talk] Running basexclient Command From a Bash Script
Small hiccup: the script works under OS X (and I'll assume under Linux) but not in the git shell under Windows. I should be able to use a separate batch script for Windows, but it would be easiest if a single script were cross-platform. Hmph.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 3/22/15, 3:44 PM, Eliot Kimber ekim...@contrext.com wrote:

Cool, thanks for all the info. I knew it was due to my very weak understanding of bash scripting. The improved script you suggested worked. I'll do the reading necessary to understand *why* it works, but that gets me over the immediate hurdle.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 3/22/15, 11:51 AM, Charles Duffy char...@dyfis.net wrote:

[Eliot, apologies for the duplicate mail -- I just realized that my initial replies weren't CC'd to the list. Note that this fixes a bug in the last version I sent you privately; please use it in preference].

---

Please see BashFAQ #50: http://mywiki.wooledge.org/BashFAQ/050; argument lists should be handled as arrays, not strings.

...also, see BashFAQ #1, http://mywiki.wooledge.org/BashFAQ/001, and (for the same reason) http://mywiki.wooledge.org/DontReadLinesWithFor

A better version of this script might look like:

    #!/bin/bash
    #      ^-- MUST NOT BE /bin/sh

    addOrUpdate() {
      git diff HEAD^ HEAD --name-only --diff-filter=AM
    }

    basexOptions=( -U admin -P admin -p 1984 -n localhost )

    while read -r line; do
      cmd=( basexclient -c "OPEN $basexDatabase; REPLACE $line $topDir/$line; EXIT" "${basexOptions[@]}" )
      printf 'Running: '; printf '%q ' "${cmd[@]}"; printf '\n'
      "${cmd[@]}"
    done < <(addOrUpdate)

On Sun, Mar 22, 2015 at 11:31 AM, Eliot Kimber ekim...@contrext.com wrote:

I'm trying to implement a git commit hook that updates a BaseX database with updates committed to the git repo (the BaseX database is being used for search and link management).

My script is here: https://github.com/dita-for-small-teams/dfst-git-commit-hooks/blob/develop/post-commit

The important code is:

-----
    addOrUpdate="git diff HEAD^ HEAD --name-only --diff-filter=AM"
    # echo "Adding/Updating files:"
    for line in `${addOrUpdate}`; do
      cmd="basexclient -c \"OPEN ${basexDatabase};REPLACE ${line} ${topDir}/${line};EXIT\" ${basexOptions}"
      echo "Running cmd: ${cmd}"
      $($cmd)
    done
-----

When I run the script I'm getting this response:

    Contrext01:docs ekimber$ ../../dita-for-small-teams/commit-hooks/git/post-commit
    Running cmd: basexclient -c "OPEN sample-project;REPLACE docs/topic-01.dita /Users/ekimber/workspace-dfst/dfst-sample-project/docs/docs/topic-01.dita;EXIT" -U admin -P admin -p 1984 -n localhost
    Stopped at , 1/6: Unknown command: "OPEN. Did you mean 'OPEN'?
    Contrext01:docs ekimber$

If I copy the command from the message and run it directly from the command line, it works as expected.

There must be some subtlety of using bash in this context, but my bash fu is weak and I have no idea what I'm doing wrong. I can't see any obvious user error, but there must be one.

Can anyone tell me what my bash scripting mistake is?

Is there a better way to do this sort of scripted interaction with BaseX? It didn't appear that BaseX scripts provided a way to take parameters; if they could, I'd just call a .bxs script with the relevant parameters. Short of generating the script and then running it, I couldn't think of a simpler way to do what I want than just using the -c option on the basexclient command.

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
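The difference between the string form and the array form recommended in BashFAQ #50 can be seen without touching BaseX at all. The following is a minimal sketch, not part of the original thread: `count_args` is a hypothetical helper, the database name and paths are made-up values, and nothing in it actually calls basexclient. It only counts how many arguments each form would hand to the command.

```shell
#!/bin/bash
# Sketch only: compares how the two forms are split into arguments.
# Nothing here calls basexclient; the values below are made-up examples.

# Hypothetical helper (not from the thread): prints the number of arguments it received.
count_args() { echo "$#"; }

basexDatabase="sample-project"
topDir="/tmp/repo"
line="docs/topic-01.dita"
basexOptions=( -U admin -P admin -p 1984 -n localhost )

# String form: the unquoted expansion is split on whitespace, so the
# script meant for -c arrives as several separate words.
cmdString="basexclient -c OPEN $basexDatabase; REPLACE $line $topDir/$line; EXIT"
stringCount=$(count_args $cmdString)

# Array form: every element stays exactly one argument,
# so the whole -c script is passed as a single word.
cmd=( basexclient -c "OPEN $basexDatabase; REPLACE $line $topDir/$line; EXIT" "${basexOptions[@]}" )
arrayCount=$(count_args "${cmd[@]}")

echo "string form: $stringCount arguments"
echo "array form:  $arrayCount arguments"
echo "argument after -c in array form: ${cmd[2]}"
printf 'Would run: '; printf '%q ' "${cmd[@]}"; printf '\n'
```

In the string form the word following -c is only the first word of the intended script, which is consistent with the "Unknown command" error reported above. Quote characters embedded in a string do not help either, because after expansion they are ordinary characters rather than shell syntax.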
Re: [basex-talk] Running basexclient Command From a Bash Script
Cool, thanks for all the info. I knew it was due to my very weak understanding of bash scripting. The improved script you suggested worked. I'll do the reading necessary to understand *why* it works, but that gets me over the immediate hurdle.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
[basex-talk] Configuration File for Basexclient Connection Details?
Looking through the docs and testing locally, it appears that the basexclient command does not use the USER or PASSWORD fields in .basex when run. That is, given this in my ~/.basex file:

    USER = admin
    PASSWORD = admin

the basexclient command still prompts me for my credentials.

Is this correct or am I missing a configuration option somewhere?

What I'm looking for is the ability to call the basexclient command from git hooks without those scripts having to maintain their own configuration file for the connection details.

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
[basex-talk] Failure on Second Update to Database
I'm getting a failure on the second commit of XML data via the basexclient command (from my git hook code).

I've created a gist here: https://gist.github.com/dc9f6d55d891b06ecae9.git with relevant logs and the data/ directory.

The exception reported is:

==
    Contrext01:dfst-sample-project ekimber$ git add .;git commit -m "updated test file"
    Running: basexclient -c CHECK\ dfst_master\;\ OPEN\ dfst_master\;\ REPLACE\ docs/topic-01.dita\ /Users/ekimber/workspace-dfst/dfst-sample-project/docs/topic-01.dita -U admin -P admin -p 1984 -n localhost
    Running: basexclient -c CHECK\ dfst_master\;\ OPEN\ dfst_master\;\ REPLACE\ dfst/metadata.xml\ \<dfst_metadata\>\<gitstate\>\<branch\>master\</branch\>\<commit\>8d264e1e73d5596758a069a4355920c448a5b41d\</commit\>\</gitstate\>\</dfst_metadata\> -U admin -P admin -p 1984 -n localhost
    Improper use? Potential bug? Your feedback is welcome:
    Contact: basex-talk@mailman.uni-konstanz.de
    Version: BaseX 8.0.3
    Java: Oracle Corporation, 1.7.0_65
    OS: Mac OS X, x86_64
    Stack Trace:
    java.lang.RuntimeException: Data Access out of bounds:
    - pre value: 21
    - #used blocks: 1
    - #total locks: 1
    - access: 0 (1 0]
        at org.basex.util.Util.notExpected(Util.java:60)
        at org.basex.io.random.TableDiskAccess.cursor(TableDiskAccess.java:462)
        at org.basex.io.random.TableDiskAccess.read1(TableDiskAccess.java:148)
        at org.basex.data.Data.kind(Data.java:304)
        at org.basex.data.atomic.Replace.getInstance(Replace.java:42)
        at org.basex.data.atomic.AtomicUpdateCache.addReplace(AtomicUpdateCache.java:95)
        at org.basex.core.cmd.Replace.replace(Replace.java:95)
        at org.basex.core.cmd.Replace.run(Replace.java:57)
        at org.basex.core.Command.run(Command.java:379)
        at org.basex.core.Command.execute(Command.java:95)
        at org.basex.server.ClientListener.run(ClientListener.java:146)
    [master 8d264e1] updated test file
     1 file changed, 1 insertion(+), 1 deletion(-)
    Contrext01:dfst-sample-project ekimber$
==

The first update succeeds, the second one fails. The second one is committing literal XML on the command line, while the first is committing a file on the file system.

This behavior is consistent: if I drop the database, restart the server, and then repeat this test, I get the same failure.

Is this my user error or a bug?

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] Failure on Second Update to Database
Are you behind a VPN? I just tried to clone it from a machine that had a VPN connection going, but it failed. When I turned the VPN off, the clone succeeded. Unless it's an issue with GitHub itself.

I've put the data on Dropbox here: https://dl.dropboxusercontent.com/u/20078596/gist-basex-failure.zip

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 3/25/15, 12:40 PM, Christian Grün christian.gr...@gmail.com wrote:

https://gist.github.com/dc9f6d55d891b06ecae9.git

Hm, it still gives me 404.. I also tried to download it from a machine with a different IP (both in Germany). Using http or https makes no difference either!

I just verified that it's not private and that I could clone it (at least with my GitHub credentials).

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 3/25/15, 12:28 PM, Christian Grün christian.gr...@gmail.com wrote:

Eliot,

Thanks for reporting this. The gist does not seem to exist (anymore); could you please check it again?

Thanks in advance,
Christian
Re: [basex-talk] Configuration File for Basexclient Connection Details?
When I removed the .basexhome and .basex files from the basex/ directory, then the home directory .basex worked. So it looks like there's definitely a difference in behavior between Windows and OS X. I verified that under Windows if I have both a ~/.basex file and the default .basex and .basexhome in the installation directory, the user and password from the ~/.basex file is used. Cheers, E. — Eliot Kimber, Owner Contrext, LLC http://contrext.com On 3/25/15, 1:25 PM, Christian Grün christian.gr...@gmail.com wrote: However, when I ran basexclient, I got this: Contrext01:dfst-sample-project ekimber$ basexclient /Users/ekimber/apps/basex/.basex: writing new configuration file. Username: Do you possibly have any other .basex* file in this directory? So maybe this is an OS X issue? Maybe someone else using OSX can have a word on this? Christian
Re: [basex-talk] Simple xQuery functions do not work as expected
Note that I'm not using the GUI, I'm using the basexclient command via a bash script. I have the basexserver running as a service (meaning it's a background task under OS X) and then I'm using git commit hooks to run bash scripts that call the basexclient command.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 3/25/15, 12:54 PM, Christian Grün christian.gr...@gmail.com wrote:

Phew, it's really difficult to say anything on this. As long as we cannot reproduce this on at least a second machine, it's hardly possible to tell what's happening here. As indicated, I haven't experienced a similar behavior so far, neither on my own system, which is pretty similar to yours, nor on any server machines in our productive instances. But of course I couldn't seriously recommend to go on with your current setup as long as the problem persists.

If you want to continue working on this, maybe some more questions:

1. Does this only happen in the GUI or also on command line?
2. Do you have any other JVM instance running beside the single BaseX GUI instance?
3. Do you get the same error when restarting the GUI?

Christian

On Wed, Mar 25, 2015 at 3:03 PM, Goetz Heller hel...@hellerim.de wrote:

Hi Christian,

this time, both the function you've seen already and a different query which tries to extract data from a larger XML file fail with the behavior described. In both query execution plans you can see that there is some optimization which places an empty pair of parentheses into the optimized query where something different should be. It looks as if a shortage of some resource (memory?) throws an exception which is silently caught somewhere. I know from my own programming experience that it happens like that: a boring nuisance is caught in the middle of something nice being developed, so you neutralize this effect temporarily and then simply forget it. Is it possible that something similar happened in this case?
The second query is too complicated to simply reduce its complexity. But I suppose you would be able to get the same behavior if you took a machine like mine: a laptop with 8 GB RAM and an Intel dual core processor running at 1,8 GHz and Win 8.1. Possibly reduce the RAM by 4 GB, or run a VMWare virtual machine on it (which I cannot do here because Microsoft's Hypervisor is installed), and then experiment using your tools at hand. I had Microsoft VS 2013 Community Edition installed, several Eclipse versions, and Node.js (none of these running when the problem showed up). No Microsoft Office.

Unfortunately, I cannot invest too much time investigating the issue since I have a lot of other things to do, but let me know anyway if I can give you further information. At first sight, I liked BaseX very much for its simple installation and ease of use, its small footprint and its well-thought-out GUI, but I will not be able to use it in a project if I cannot rely on the results it delivers.

Kind regards,
Goetz

P.S. Here is the query plan for the second query; you will see that the $ti variable is not resolved appropriately (line 37 of output).

BEGIN_OUTPUT
Compiling:
- inlining $retVal_5
- simplifying flwor expression
- pre-evaluating doc(C:\test\Labels.xml)
- rewriting (compare(@*:label, $key_0) = 0)
- rewriting (compare(@*:LG, DE) = 0)
- inlining $retVal_1
- simplifying flwor expression
- rewriting (compare(@*:label, $key_2) = 0)
- rewriting (compare(@*:LG, $lang_3) = 0)
- rewriting (tokenize($nodocojs_9, -))[position() = 2]
- inlining local:slz#1
- inlining $arg_15
- simplifying flwor expression
- rewriting (tokenize($nodocojs_9, /))[position() = 1]
- removing unknown element/attribute text
- pre-evaluating $ti_7/*:TI_TEXT/()
- inlining local:getLabel#1
- inlining $key_16
- simplifying flwor expression
- inlining local:getLabel2#2
- removing redundant $lang_18 as xs:string cast.
- inlining $key_17
- inlining $lang_18
- removing unknown element/attribute TERM
- pre-evaluating document-node {Labels.xml}/*:LABELS/*:LABEL[compare(@*:label, _and) = 0.0]/()/text()
- simplifying flwor expression
- inlining $pd_8
- inlining $nd_10
- inlining $oj_11
- inlining $ds_12
- inlining $dt_13
- inlining $hd_14

Query:

    declare variable $lang as xs:string := 'DE';
    declare variable $labels := doc('C:\test\Labels.xml');

    declare function local:getLabel( $key ) {
      let $retVal := $labels/LABELS/LABEL[compare(@label, $key) = 0]/TERM[compare(@LG, $lang) = 0]/text()
      return $retVal
    };

    declare function local:getLabel2( $key, $lang as xs:string ) {
      $labels/LABELS/LABEL[compare(@label, $key) = 0]/TERM[compare(@LG, $lang) = 0]/text()
    };

    declare function local:slz( $arg ) {
      let $retVal := xs:string(xs:double($arg))
      return $retVal
    };

    for $n in(/TED_EXPORT)
    let $ti := $n/TRANSLATION_SECTION/ML_TITLES/ML_TI_DOC[@LG=$lang]
    let $pd := $n/CODED_DATA_SECTION/REF_OJS/DATE_PUB/text()
    let $nodocojs := $n/CODED_DATA_SECTION/NOTICE_DATA/NO_DOC_OJS
    let $nd := concat(local:slz(tokenize
Re: [basex-talk] Simple xQuery functions do not work as expected
Sorry--just realized I confused the threads. Ignore me.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] Configuration File for Basexclient Connection Details?
Hmm, the behavior is a little odd. This is under OS X. Under Windows it seems to work as expected: my user-specific .basex file is used. So maybe this is an OS X issue?

I discovered that I did have a .basex file in both the basex/ directory and my home directory. The one in the basex/ directory had these entries:

    USER =
    PASSWORD =
    SERVERHOST =

So I deleted them to see if the same entries in the ~/.basex file would get used. However, when I ran basexclient, I got this:

    Contrext01:dfst-sample-project ekimber$ basexclient
    /Users/ekimber/apps/basex/.basex: writing new configuration file.
    Username:

And editing the basex/.basex file showed that in fact the empty USER and PASSWORD entries had been restored. Likewise, if I delete basex/.basex, I get the same result.

So it looks like even when there is a ~/.basex file it's not being used. I also tried putting a property in ~/.basex and it was not set.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 3/25/15, 12:39 PM, Christian Grün christian.gr...@gmail.com wrote:

Looking through the docs and testing locally, it appears that the basexclient command does not use the USER or PASSWORD fields in .basex when run. That is, given this in my ~/.basex file:

    USER = admin
    PASSWORD = admin

It should actually do so (I just tried, and it worked for me). Did you possibly have another .basex file located in the directory where you started basexserver from?

Just in case you haven't found this by yourself: [1] describes which directories are checked for .basex files when starting BaseX.

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Configuration#Home_Directory
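When more than one .basex file may be in play, it can help to list which candidate directories actually contain one before starting the client. The following is only a rough sketch: `find_basex_configs` is a hypothetical helper, `BASEX_DIR` is an assumed variable name for the installation directory, and the list of directories passed in is a guess for illustration. The authoritative lookup order is the one described on the Configuration page cited above.

```shell
#!/bin/bash
# Sketch: report which of the given directories contain a .basex file.
# find_basex_configs is a hypothetical helper, not part of BaseX.
find_basex_configs() {
  local dir
  for dir in "$@"; do
    if [[ -f "$dir/.basex" ]]; then
      printf '%s\n' "$dir/.basex"
    fi
  done
}

# Candidate locations are assumptions for illustration: the home directory,
# the current directory, and an installation directory taken from the
# hypothetical BASEX_DIR variable (falling back to ~/apps/basex).
find_basex_configs "$HOME" "$PWD" "${BASEX_DIR:-$HOME/apps/basex}"
```

If the output lists more than one file, removing or renaming all but one is a quick way to test which of them the client really reads.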
Re: [basex-talk] Failure on Second Update to Database
Cool--always happy to reveal a bug.

Let me know if I can help in testing a fix.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 3/25/15, 1:23 PM, Christian Grün christian.gr...@gmail.com wrote:

Pardon me… I just tried to call the gist in the browser. Using git clone works as you said.

I managed to reproduce the problem, and it's a clear bug [1]. We'll do our best to fix it as soon as possible.

Best,
Christian

[1] https://github.com/BaseXdb/basex/issues/1112
[basex-talk] DITA for Small Teams Git-to-BaseX Hooks
I think I have my git hooks for loading XML into BaseX working well enough (there's more refinement to do but the minimum functionality is there).

The code is available here: https://github.com/dita-for-small-teams/dfst-git-commit-hooks/tree/develop (that's on the develop branch).

The functionality works as follows:

- On checkout of a branch, see if a corresponding database is in BaseX. If not, create the database (named for the repo directory and branch), load all XML files (*.xml, *.dita, *.ditamap), and capture the git branch name and commit hash. If the database exists and the commit is not the current commit, reflect the changes between the DB's commit and the branch's commit.
- On commit or merge, update the database to reflect the changes between HEAD and HEAD^.

I do not react (yet) to branch delete, but BaseX database maintenance is easy enough that it wouldn't be hard (e.g., just drop the database for that branch).

I think this set of hooks is sufficient to ensure that the BaseX databases will always be in sync with the git repo.

I'm using the approach of one database per branch so that I can then implement branch-aware queries in the repo. This is essential for link management, where you want to be able to view the documents through a specific branch such that non-version-specific links will resolve to the version visible on that branch. (This approach doesn't handle being able to resolve to older versions. That would require maintaining per-commit databases, although that might be necessary for certain use cases, hmmm.)

Thanks for all the help here, especially Charles Duffy's help with bash script coding.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
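The two building blocks described above, a database name derived from the repo directory and branch, and a filter for the XML file types to load, might be sketched as follows. This is an illustration under stated assumptions, not the code in the repository: `db_name_for` and `filter_xml_files` are hypothetical names, and the naming scheme (directory name, underscore, branch, with other characters replaced by underscores) is a guess that the real hooks may not share.

```shell
#!/bin/bash
# Sketch of two helpers a git-to-BaseX hook might use. Both names and the
# naming scheme are assumptions for illustration only.

# Derive a database name from a repo directory and a branch name.
db_name_for() {
  local repoDir=$1 branch=$2 name
  name="$(basename "$repoDir")_${branch}"
  # Replace anything outside letters, digits, underscore, and hyphen
  # (an assumed safe character set) with an underscore.
  printf '%s\n' "${name//[^A-Za-z0-9_-]/_}"
}

# Read file names from stdin, one per line, and print only those with
# the extensions the hooks load (*.xml, *.dita, *.ditamap).
filter_xml_files() {
  local path
  while IFS= read -r path; do
    case $path in
      *.xml|*.dita|*.ditamap) printf '%s\n' "$path" ;;
    esac
  done
}
```

In a post-commit hook the filter could be fed from the same git command used earlier in this list, for example `git diff HEAD^ HEAD --name-only --diff-filter=AM | filter_xml_files`, with each resulting line passed to a REPLACE command for the database named by `db_name_for`.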
Re: [basex-talk] Failure on Second Update to Database
Is there a workaround for this bug? Do I understand correctly that the issue only occurs if the first document has a namespace declaration?

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 3/25/15, 1:44 PM, Christian Grün christian.gr...@gmail.com wrote:

Cool--always happy to reveal a bug.

;) And I'm always happy to receive bug reports, because it feels like just being one step closer to a bug-free release.

Let me know if I can help in testing a fix.

Thanks for the offer. I noticed that the error was caused by an old assertion in the code:

https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/io/random/TableDiskAccess.java#L323-L325

I removed this check, and none of our W3 and JUnit update tests complained. It's tempting to simply leave it like that, but I'll have a closer look at this tomorrow. I'll tell you once this can be tested.

Christian
[basex-talk] Optimizing Element Access By Attribute Value Matching
DITA defines the notion of a layered hierarchy of element types, where every DITA-defined element is either a base type or a specialized type derived from some base type. The type hierarchy of each element is specified by a @class attribute that lists the ancestry and leaf type of the element. For example, the element type concept is a specialization of the base type topic and so has a @class value of "- topic/topic concept/concept ". Each blank-delimited term is a module name/element name pair. Processing in DITA is specialization-aware if elements are selected by @class token rather than by concrete element type. For example, you might apply processing to topics of any type by matching on *[contains(@class, ' topic/topic ')], which will match all DITA topics, regardless of their specialized type. The challenge this presents in a database context is optimizing lookups based on these @class values. For large repositories an XQuery like //*[contains(@class, ' topic/topic ')] is going to be quite slow, as it requires a string comparison of every @class value. Even if there is an attribute value index it will still be slow. The obvious solution would be to index by @class token, e.g., an index where the keys are topic/topic, topic/p, etc. Is there a way to construct such an index in BaseX? Is there a better way to address this type of string-match-based lookup? Thanks, Eliot -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
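A minimal sketch of the token-based lookup described above, in plain XQuery 3.1. It is not a persistent database index and not something proposed in the thread: the sample elements are hypothetical, and the map built here lives only for the duration of the query. It only illustrates matching on whole @class tokens and keying elements by token.

```xquery
xquery version "3.1";

(: Hypothetical in-memory sample; real DITA content would come from a database. :)
let $docs :=
  <root>
    <concept class="- topic/topic concept/concept "/>
    <p class="- topic/p "/>
    <topic class="- topic/topic "/>
  </root>

(: Token-based match: compares whole tokens rather than substrings. :)
let $topics := $docs//*[tokenize(@class, '\s+') = 'topic/topic']

(: A simple in-memory "index": one map entry per @class token,
   whose value is the sequence of elements carrying that token. :)
let $index := map:merge(
  for $e in $docs//*[@class]
  for $token in tokenize(normalize-space($e/@class), ' ')[contains(., '/')]
  group by $token
  return map:entry($token, $e)
)

(: Both counts are 2 for this sample: the concept and the topic element. :)
return (count($topics), count($index('topic/topic')))
```

Whether a lookup like this is fast enough on a large repository, or whether the database can maintain an equivalent index itself, would need to be checked against the BaseX version in use.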
[basex-talk] Best Forum for General XQuery How To Discussions?
What is the best forum for asking general how-to XQuery questions that are not BaseX specific? Thanks, Eliot — Eliot Kimber, Owner Contrext, LLC http://contrext.com
[basex-talk] Separate Databases vs. Directories Within One Database?
In the discussion of adding metadata to a bunch of files, Christian points out that you can either limit queries to directories within a single database or apply a query to multiple databases. My question: when or why would you prefer one approach over the other? In my case I'm using BaseX to reflect the XML contents of git repositories. My current approach is to create a separate database for each repo/branch pair, my reasoning being that this makes it easiest to limit queries to just that branch. Because the BaseX data is intended to be a read-only reflection of the git-managed source, it also makes it easy to clear the data for a branch if it's gotten out of sync (or I suspect it's gotten out of sync) by simply dropping the database. I have complete control over the queries (through a library of functions that understand the git nature of the databases), so I could just as easily use a single database with subdirectories that reflect the repos and branches. In this scenario, as an example, is there any compelling reason to use one approach or the other? I like having one database per branch because that seems like a natural mapping that generally keeps things simple and more or less obvious (e.g., running LIST will show the list of databases, which reflect the repo and branch names in their names). In this application the scale will usually be relatively small: 1000s or 10s of 1000s of individual documents in any given branch, but the querying and indexing, which supports maintaining knowledge of the links within the XML content, could get intense. Cheers, Eliot -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Separate Databases vs. Directories Within One Database?
Christian, That is helpful. Basically you've confirmed my initial analysis: because BaseX databases are lightweight, keeping things simple is the most appropriate choice. If I were doing things at scale, of course I'd do performance testing to see where the bottlenecks are, but that is not a concern for what I'm doing now. Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com On 5/16/15, 5:10 AM, Christian Grün christian.gr...@gmail.com wrote: Hi Eliot, As usual, there is no simple answer to such a question. However, I can say that it sounds like a good choice to use one BaseX database per git repository. In contrast to many other DBMSs, databases in BaseX are pretty lightweight containers, and in some of our own use cases we even create one database per document. If you have hundreds or thousands of databases, then it may be reasonable to merge them into single units, because it may take too much time to access the database directories in the file system. Some file systems are better than others at handling large numbers of files and directories on the same level. The same observation applies if you frequently write queries that access more than one database: it's always faster to open a single database (but usually you will only notice this when opening a larger number of databases). Hope this helps, Christian
[basex-talk] Way to React To Updates Using XQuery?
As part of my DITA link management system, I need to create and update various indexes and data structures whenever specific documents are added or updated. I was looking for some sort of trigger or event handler by which I would register an XQuery function and the system would call it on add, update, or delete, but I didn't see anything like that in the docs. Since my updates are, in my case, being driven by git commit hooks I could have the commit hook call a function following updates and pass that function the list of added, updated, and deleted docs. But that would not support the case where documents were loaded by some other route. Is such a mechanism available? Thanks, Eliot — Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Way to React To Updates Using XQuery?
I will track the conversation. In thinking about my requirements more, I realized that it probably makes sense to build my indexes and cached data structures on demand. At the time a query is made I can determine whether the index or cache I need A) already exists and B) is older than the timestamp(s) of the documents involved. If each cache records the paths of the documents that contribute to it, then checking that the cache is not older than any contributing docs and that no contributing doc has been deleted should be fast enough. If I have a general updateCaches() function, I can simply call that function after any load or update operation from my git hooks. For my current use case that will be as good as having some sort of trigger feature. Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com On 4/16/15, 12:03 AM, Christian Grün christian.gr...@gmail.com wrote: Is such a mechanism available? Such a mechanism is indeed planned to be added in a future version. It might take a while, though, to make this happen, as we first want to sort out what the most important use cases are (e.g.: do we want synchronous or asynchronous triggers?). You are invited to follow the discussion at GitHub: https://github.com/BaseXdb/basex/issues/1082
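The freshness check described in this message could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name local:cache-is-current is hypothetical, and how the cache and document timestamps are obtained (and how deleted documents are detected) is not covered by the thread and is left out here.

```xquery
xquery version "3.1";

(: Hypothetical helper: a cache is current if no contributing
   document was modified after the cache was built. :)
declare function local:cache-is-current(
  $cache-time as xs:dateTime,
  $doc-times  as xs:dateTime*
) as xs:boolean {
  every $t in $doc-times satisfies $t le $cache-time
};

(: Example with made-up timestamps: both documents predate the cache,
   so the result is true. :)
local:cache-is-current(
  xs:dateTime('2015-04-16T12:00:00Z'),
  (xs:dateTime('2015-04-15T09:30:00Z'),
   xs:dateTime('2015-04-16T08:00:00Z'))
)
```

A general updateCaches() function, as mentioned above, would presumably call a check like this per cache and rebuild only the caches for which it returns false.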
Re: [basex-talk] Mystery Failure on REPLACE but Not ADD
Hmm, the failure is reliable on my system. I also discovered that having a file that is not XML and attempting to load it as XML will fail. I had a file named file1.xml with the content added (created by some accident of my flailing with bash, no doubt) in the directory I was loading using a filter on *.xml. I got the same failure. When I removed this file, the failure went away. It could be an issue with Java versions or another OS X-specific issue, perhaps. Here's what I'm running:
    Contrext01:dfst-sample-project ekimber$ java -version
    java version "1.7.0_65"
    Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
    Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
Unfortunately I don't really have time just now to spend trying to debug this more deeply, but it must be something in the DTD-aware XML parsing tool chain that fails before the skip corrupt = true logic comes into play. I'll log an issue for it and try to track it down once I have a bit more time. Definitely don't want to have this sort of failure hanging about if we can avoid it. Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com On 4/13/15, 8:41 AM, Christian Grün christian.gr...@gmail.com wrote: Hi Eliot, Having the CATPATH set or not set does not affect the failure. Do you possibly mean QUERYPATH or any other BaseX option? That is, this fails:
    basexclient -c 'CHECK dfst_dfst-sample-project_develop; SET DTD true; REPLACE dfst/metadata.xml <dfst_metadata><gitstate><branch>develop</branch><commit>2df2f9674f7e8a5d43865411b193f76b19e9565d</commit></gitstate></dfst_metadata>' -U admin -P admin -p 1984 -n localhost
Hm, once again it does not fail on my system. What I did was... * download and unzip basex.zip * start bin/basexserver * run your command (on Windows, replacing the single with double quotes) What else do I have to do? C.
Re: [basex-talk] Mystery Failure on REPLACE but Not ADD
I have isolated the variable to the DTD setting: if I set DTD to true then I get the failure. If it is false, no failure. This is for XML with no associated DTD or grammar of any sort. Having the CATPATH set or not set does not affect the failure. If I add SET DTD false to my command set then the replace succeeds. So this appears to be a bug in replacing (but not adding) DTD-less documents when DTD is set to true. That is, this fails:
    basexclient -c 'CHECK dfst_dfst-sample-project_develop; SET DTD true; REPLACE dfst/metadata.xml <dfst_metadata><gitstate><branch>develop</branch><commit>2df2f9674f7e8a5d43865411b193f76b19e9565d</commit></gitstate></dfst_metadata>' -U admin -P admin -p 1984 -n localhost
And this succeeds:
    basexclient -c 'CHECK dfst_dfst-sample-project_develop; SET DTD false; REPLACE dfst/metadata.xml <dfst_metadata><gitstate><branch>develop</branch><commit>2df2f9674f7e8a5d43865411b193f76b19e9565d</commit></gitstate></dfst_metadata>' -U admin -P admin -p 1984 -n localhost
Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
[basex-talk] HTTP Server Not Serving Any RESTXQ Results
I'm trying to set up some basic Web services using the basexhttp server. I'm on OS X. If I understand the docs, after running the basexhttp command, I should get the sample HTML start page returned by the page:start() function defined in the restxq.xqm file when I go to http://localhost:8984. However, I'm not getting that page; I'm getting a 404 response from the Jetty server:
    HTTP ERROR 404
    Problem accessing /. Reason: No function found that matches the request.
    Powered by Jetty://
I verified that restxq.xqm is a loaded module:
    repo list
    Name    Version  Type      Path
            -        Internal  .DS_Store
    restxq  -        Internal  restxq.xqm
    2 package(s).
I'm sure this worked when I first started playing with BaseX and installed it initially. I tried both 8.0.3 (the first version I used) and 8.1 with no difference. I must have done something to the configuration to cause this problem, but I don't see any obvious mistake on my part: I haven't knowingly changed the web.xml or jetty.xml files, and web.xml seems to clearly enable handling of root URLs by the RESTXQ handler:
    <servlet>
      <servlet-name>RESTXQ</servlet-name>
      <servlet-class>org.basex.http.restxq.RestXqServlet</servlet-class>
      <load-on-startup>1</load-on-startup>
    </servlet>
    <servlet-mapping>
      <servlet-name>RESTXQ</servlet-name>
      <url-pattern>/*</url-pattern>
    </servlet-mapping>
The WebDAV service does work: I can retrieve files from specific databases using the WebDAV URL. Any suggestions on where my configuration might be bad? Thanks, Eliot -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] HTTP Server Not Serving Any RESTXQ Results
Christian, I'm using the Zip distribution. The REST service works and WebDAV works; it is only RESTXQ that doesn't appear to be correctly configured. So either I'm missing some essential configuration detail or there's an OS X-specific problem that I'm tripping over. Thanks, Eliot -- Eliot Kimber, Owner Contrext, LLC http://contrext.com On 4/11/15, 6:17 PM, Christian Grün christian.gr...@gmail.com wrote: Hi Eliot, If I understand the docs, after running the basexhttp command, I should get the sample HTML start page returned by page:start() function defined in the restxq.xqm file when I go to http://localhost:8984 However I'm not getting that page, I'm getting a 404 response from the Jetty server: Which of our distributions (zip, war, ...) have you been using so far? I verified that restxq.xqm is a loaded module:
    repo list
    Name    Version  Type      Path
            -        Internal  .DS_Store
    restxq  -        Internal  restxq.xqm
restxq.xqm must indeed be located in the webapp directory, and not in the repository. Maybe you can check out the BaseX ZIP archive, it should run out of the box. Hope this helps, Christian
Re: [basex-talk] HTTP Server Not Serving Any RESTXQ Results
I found my configuration error: bad WEBPATH in my user-specific .basex file. I had, for whatever reason (maybe through some bad copying at some point), set the WEBPATH to a bogus directory. Once I set it to the actual webapp location then everything works as expected. A bit of startup diagnostics here would have helped, e.g. WEBPATH directory 'foo/bar' does not exist. I'll create an issue and try to circle back to implementing a fix once my current crunch has passed. Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
[basex-talk] Mystery Failure on REPLACE but Not ADD
Using the basex Zip distribution for both 8.0.3 and 8.1, I'm getting a failure on REPLACE but not ADD for the same XML data. This command succeeds:
    basexclient -c "CHECK dfst_dfst-sample-project_master; OPEN dfst_dfst-sample-project_master; ADD to dfst/metadata.xml <dfst_metadata/>"
But this command fails:
    basexclient -c "CHECK dfst_dfst-sample-project_master; OPEN dfst_dfst-sample-project_master; REPLACE dfst/metadata.xml <dfst_metadata/>"
    ... (Line 1): Premature end of file.
This definitely worked in the past, so something must have changed on my system. Obviously, the error message is not very helpful in this case. Any idea what the problem might be or how I would diagnose this failure? Thanks, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Mystery Failure on REPLACE but Not ADD
It's definitely a function of my local .basex settings. If I revert to the default settings (make the basex install dir .basexhome and remove ~/.basex) then the replace succeeds. Trying to figure out what the setting is that results in the failure. Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Documentation on Module Imports Missing a Detail
Yes, I can update the documentation. Cheers, E. — Eliot Kimber, Owner Contrext, LLC http://contrext.com On 4/12/15, 11:56 AM, Christian Grün christian.gr...@gmail.com wrote: Hi Eliot, I agree, the documentation could surely be more precise here. Would you possibly be interested in editing it? We are always interested in external contributions (the only reason for enforcing a registration is that we had to stop spamming activities in the past). In the case of URNs, the token separator is :, so you could treat each token as a directory name. Sounds reasonable; I've added an issue for that (https://github.com/BaseXdb/basex/issues/1124). Cheers, Christian
Re: [basex-talk] Mystery Failure on REPLACE but Not ADD
No obvious cause for the failure yet but it feels like a bug that is revealed by a particular configuration organization. Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
[basex-talk] Documentation on Module Imports Missing a Detail
I was trying to have a RESTXQ module import a library module stored in the configured REPO directory. I copied my module into that directory and verified that it was listed:
    Name            Version  Type      Path
    basex-utils     -        Internal  basex-utils.xqm
    dita-utils      -        Internal  dita-utils.xqm
    linkmgmt-utils  -        Internal  linkmgmt-utils.xqm
    relpath-utils   -        Internal  relpath-utils.xqm
    4 package(s).
I want to import the basex-utils module. The docs say: "Repository modules are stored in a directory named BaseXRepo or repo, which resides in your home directory (http://docs.basex.org/wiki/Configuration#Home_Directory). XQuery modules can be manually copied to the repository directory or installed and deleted via commands (http://docs.basex.org/wiki/Repository#Commands). If a module is placed in the repository, there is no need to specify a location." However, this import failed to find the module:
    import module namespace bxutil = "http://dita-for-small-teams.org/xquery/modules/basex-utils";
But after I did an explicit REPO INSTALL on the module, BaseX put it in a directory that mirrors the namespace URL structure. At that point the import succeeded. Thus, it appears that the module needs to be installed or otherwise stored in a directory that mirrors the namespace declaration--it is not sufficient to simply copy the module into the repo directory. I think this assumption that the namespace URI will be a URL is not valid. If I create a module that uses a URN for the namespace URI, I get this failure on REPO INSTALL: [bxerr:BXRE0002] URI is invalid or has no path component: 'urn:names:xquerymodule:testurn'. In the case of URNs, the token separator is ":", so you could treat each token as a directory name. Cheers, Eliot
Re: [basex-talk] URLs for BaseX Documents and resolve-uri()
BTW, my workaround for now is to check whether base-uri() returned an absolute URL. If it did not, I use my existing URL-construction code to combine the relative URL with the parent of the result of base-uri(); otherwise I use resolve-uri(). That's a sufficiently general solution that is not explicitly BaseX-specific. Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com On 4/17/15, 12:07 PM, Christian Grün christian.gr...@gmail.com wrote: Hi Eliot, This fails under BaseX: [FORG0002] Base URI is not absolute: dfst^dfst-sample-project^develo I think it also fails if the base URI starts with a slash. Try this for example: resolve-uri('abc.xml', '/path/def.xml') What result would you expect from this query, given that it did not raise an error? Using string operations could be an alternative... replace($baseURI, '/[^*]+$', $topicResourcePart) ...but I am not sure if this fulfills all needs. Christian
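A minimal sketch of the workaround described in this message, assuming that "absolute" can be approximated by the presence of a URI scheme. The function name local:resolve and the scheme test are illustrative assumptions, not the code from the actual library, and the fallback branch does not normalize "../" segments.

```xquery
xquery version "3.1";

(: Hypothetical helper: use resolve-uri() when the base has a URI scheme,
   otherwise fall back to simple string concatenation with the base's parent. :)
declare function local:resolve(
  $relative as xs:string,
  $base     as xs:string
) as xs:string {
  if (matches($base, '^[a-zA-Z][a-zA-Z0-9+.\-]*:'))
  then string(resolve-uri($relative, $base))
  else
    (: Strip the last path segment to get the parent "directory". :)
    let $parent := replace($base, '[^/]+$', '')
    return concat($parent, $relative)
};

(: With a scheme-less base like the ones BaseX returns, this yields
   dfst^dfst-sample-project^develop/docs/topics/topic-01.dita :)
local:resolve(
  'topics/topic-01.dita',
  'dfst^dfst-sample-project^develop/docs/map.ditamap'
)
```

Because the test only looks at the string form of the base URI, the function stays generic XQuery and needs no BaseX-specific logic.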
Re: [basex-talk] URLs for BaseX Documents and resolve-uri()
Yes, resolve-uri('foo', '/bar') also fails. If I do base-uri('/foo') I get file:/foo, which is a bit unexpected since I'm in the context of the BaseX system and not in the context of the file system (at least that's what my head thinks: BaseX itself may have different ideas). I think what I expect is either that BaseX interprets all URIs not prefixed with file:/ as relative to the BaseX repository, so that rather than dbname/path/to/doc the URL format would be /dbname/path/to/doc, or that there is a BaseX-specific scheme that is used with all absolute URIs, e.g. basex:/dbname/path/to/doc. That would remove any potential ambiguity about the intent of a given URL. Thus, document-uri(root($node)) would return either /foo/bar or basex:/foo/bar, but not foo/bar. Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
[basex-talk] URLs for BaseX Documents and resolve-uri()
I'm migrating XSLT functions for working with DITA documents to XQuery. As part of this function package I have functions that resolve URI references from one document to another. This involves creating absolute URLs from relative URLs using a document or element as the base URI context. The problem I'm running into is that the URLs used within BaseX are not absolute (that is, they don't start with /). For example, document-uri(root($node)) returns:
    dfst^dfst-sample-project^develop/docs/tests/complex_map/complex_map.ditamap
not
    /dfst^dfst-sample-project^develop/docs/tests/complex_map/complex_map.ditamap
This causes a problem for resolve-uri(), used like this: resolve-uri($topicResourcePart, base-uri($topicref)) This fails under BaseX: [FORG0002] Base URI is not absolute: dfst^dfst-sample-project^develo These libraries are intended to be generic XQuery, so I'm trying to avoid having any BaseX-specific logic in these functions. Is there a workaround or other solution to this? Obviously, this way of representing document URIs internally is not something that could be easily changed, but it is definitely a problem in terms of the expectations of URI handling by built-in XQuery functions. Thanks, Eliot -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
[basex-talk] How To Reflect XML Source in HTML Output?
I'm using the RESTXQ stuff (very nice) to build a little DITA link management application. In the service of this I sometimes need to show the raw XML source of the docs in the repo. I looked at the DBA Web app and it looks like it's using some JavaScript for this, and I didn't see any obvious solution in the various BaseX packages, but I could have missed something. I hacked a quick little typeswitch transform just to get something, but it doesn't preserve newlines from the source XML (I notice the DBA Web app does, so they must be preserved under the covers). Is there a built-in way to generate escaped XML for use in, e.g., an HTML pre element, or is there some silly thing I need to do in my typeswitch to make sure the newlines make it to the HTML result? Thanks, Eliot -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
[basex-talk] Working With Maps Using FLWOR Expressions?
I think I'm missing something fundamental but I haven't been able to find a relevant example of what I'm trying to do. I suspect I'm being derailed by procedural brain damage. I have a function that takes a map as input and returns a new map reflecting updates applied to the input map. The business logic of this function is to iterate over some sequence and add or update map items as needed, e.g.: let $map := map { } for $key in ('A', 'B') return map:put($map, $key, 'somevalue') The problem is that each iteration of the for loop returns a new map. In the context of a FLWOR expression I'm not seeing how to effectively update the same map instance. What fundamental aspect of map manipulation am I missing? Thanks, Eliot -- Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Working With Maps Using FLWOR Expressions?
Of course. Must be my lack of sleep that kept me from seeing that solution :-) Cheers, E. -- Eliot Kimber, Owner Contrext, LLC http://contrext.com On 4/18/15, 11:43 AM, Andy Bunce bunce.a...@gmail.com wrote: Hi Eliot, Take a look at map:entry and map:merge [1]:
    let $map := map { "a": "old", "x": 43 }
    let $seq := ("a", "b")
    return map:merge(($map, for $item in $seq return map:entry($item, "somevalue")))
/Andy [1] http://docs.basex.org/wiki/Map_Module#map:entry
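Alongside the map:merge approach suggested above, the "iterate and update" logic can also be sketched as a fold, which threads the map through each step instead of trying to mutate it. This is an illustrative alternative rather than something proposed in the thread; the keys and values are the same made-up ones used in the messages. One point worth checking against your processor: the final XQuery 3.1 specification makes map:merge keep the first value for a duplicate key by default, so with map:merge the order of the maps (or an explicit duplicates option) decides whether "a" ends up updated.

```xquery
xquery version "3.1";

let $map := map { 'a': 'old', 'x': 43 }
let $keys := ('a', 'b')
return
  (: fold-left passes the accumulated map to each step;
     map:put returns a new map with the key added or replaced. :)
  fold-left($keys, $map, function($acc, $key) {
    map:put($acc, $key, 'somevalue')
  })
```

The result is a map with three entries: 'a' and 'b' both mapped to 'somevalue', and 'x' still mapped to 43.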
Re: [basex-talk] URLs for BaseX Documents and resolve-uri()
Related question: is there a preferred way to determine from within an XQuery function that the current XQuery engine is BaseX? I know I could check whether a particular BaseX-specific function exists, but I was looking for something either more obvious or more general from an XQuery standpoint.

Cheers,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 4/17/15, 10:51 AM, Eliot Kimber ekim...@contrext.com wrote:

I'm migrating XSLT functions for working with DITA documents to XQuery. As part of this function package I have functions that resolve URI references from one document to another. This involves creating absolute URLs from relative URLs using a document or element as the base URI context.

The problem I'm running into is that the URLs used within BaseX are not absolute (that is, they don't start with "/"). For example, document-uri(root($node)) returns:

    dfst^dfst-sample-project^develop/docs/tests/complex_map/complex_map.ditamap

not

    /dfst^dfst-sample-project^develop/docs/tests/complex_map/complex_map.ditamap

This causes a problem for resolve-uri(), used like this:

    resolve-uri($topicResourcePart, base-uri($topicref))

This fails under BaseX:

    [FORG0002] Base URI is not absolute: dfst^dfst-sample-project^develo

These libraries are intended to be generic XQuery, so I'm trying to avoid having any BaseX-specific logic in these functions.

Is there a workaround or other solution to this? Obviously, this way of representing document URIs internally is not something that could be easily changed, but it is definitely a problem in terms of the expectations of URI handling by built-in XQuery functions.

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] How To Reflect XML Source in HTML Output?
serialize() does just what I want:

    <pre>
    {
      let $map := dftest:testResolveTopicOrMapUri($repo, $branch)
      return serialize($map)
    }
    </pre>

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 4/17/15, 12:43 PM, Christian Grün christian.gr...@gmail.com wrote:

    declare function dftest:xmlToHtmlCode($nodes as node()*) as node()* {
      <pre>{
        for $node in $nodes
        return dftest:nodeToHtml($node)
      }</pre>
    };

You could serialize the nodes...

    let $input := <x/>
    return <pre>{ serialize($input) }</pre>

...or directly retrieve them from a file:

    let $input := fn:unparsed-text(...)
    return <pre>{ $input }</pre>

    let $text := '<x/>'
    return <pre>{ $text }</pre>

I see.. This won't work, because the children of the pre node

That is, I just want to format the XML markup within a pre element. The result could be arbitrary XML elements, not necessarily a complete document.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 4/17/15, 12:13 PM, Christian Grün christian.gr...@gmail.com wrote:

I'm using the RESTXQ stuff (very nice) to build a little DITA link management application. In the service of this I need to sometimes show the raw XML source of the docs in the repo.

Did you e.g. try to embed the results of file:read-text() or fn:unparsed-text()? But I may have got wrong what you are trying to do.

I am not sure what part of the DBA you are talking about.. Do you refer to the result view of the query panel? Maybe you need to specify different serialization parameters (i.e., a different output method) as described in our Wiki.

On Fri, Apr 17, 2015 at 6:36 PM, Eliot Kimber ekim...@contrext.com wrote:

I looked at the DBA Web app and it looks like it's using some Javascript for this, and I didn't see any obvious solution in the various BaseX packages, but I could have missed something.

I hacked a quick little typeswitch transform just to get something, but it doesn't preserve newlines from the source XML (but I notice the DBA Web app does, so they must be preserved under the covers).

Is there a built-in way to generate escaped XML for use in, e.g., an HTML pre element, or is there some silly thing I need to do in my typeswitch to make sure the newlines make it to the HTML result?

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
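The serialize() approach that resolved this thread can be sketched in isolation as follows. This is an illustrative example only: the topic element is invented sample data, and the exact whitespace of the serialized string depends on the processor's default serialization parameters.

```xquery
(: Invented sample input; any node sequence would do. :)
let $input := <topic id="t1"><title>Sample</title></topic>
return
  (: serialize() turns the node into a string. Placing that string inside
     the pre element makes it a text node, so the markup characters are
     escaped when the HTML result is written out. :)
  <pre>{ serialize($input) }</pre>
```

The design point is that the pre element ends up with a single text child rather than element children, which is why the browser shows the markup instead of interpreting it.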
Re: [basex-talk] Run Queries in Background from RESTXQ App?
I think I wasn't thinking clearly about the problem. If I have a link from one page of the Web app to a REST URL, the results of that URL have to be shown, so without using Javascript to submit the URL there will always be a separate result, which I can direct to a separate browser window. So it's not really an issue. And if I wanted things to be a bit more sophisticated I would use Javascript to submit the REST URL and manage the response, however long it took to return.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/4/15, 1:26 PM, Eliot Kimber basex-talk-boun...@mailman.uni-konstanz.de on behalf of ekim...@contrext.com wrote:

I'm creating a RESTXQ-based Web application. One of the things I need to do is have a REST method that then does a potentially long-running operation against the database. I don't see an obvious way to start a long query and not have it block in this configuration.

If I had a standalone client app (e.g., a Java app that provided the UI) then it would be no problem: just create a new connection for the long-running operation. But in the context of a RESTXQ application is there a way to do it? I didn't see anything obvious in the docs but I'm not sure I know what to look for.

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
[basex-talk] Run Queries in Background from RESTXQ App?
I'm creating a RESTXQ-based Web application. One of the things I need to do is have a REST method that then does a potentially long-running operation against the database. I don't see an obvious way to start a long query and not have it block in this configuration.

If I had a standalone client app (e.g., a Java app that provided the UI) then it would be no problem: just create a new connection for the long-running operation. But in the context of a RESTXQ application is there a way to do it? I didn't see anything obvious in the docs but I'm not sure I know what to look for.

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
[basex-talk] Organizing Large Numbers of Small Documetns?
In my link management application I'm creating a potentially large number of small documents that will serve as a where-used index over the content documents.

Are there any potential issues with how these documents are organized into directories within a database? The simplest thing would be to put them into one directory, but I could also organize them into one directory per document used (which would then result in one directory per document that has links to it, which in a DITA context would normally be one directory for each document, as every document save a few should have at least one link to it).

So if I had a repository with 100,000 topics, each with at least one link, is it a problem to have either 100,000 * x files in one directory or 100,000 directories under one parent directory?

I didn't see anything in the docs about limitations or practices for storing large numbers of documents.

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
[basex-talk] Returning REST Response From Updating Function
I'm trying to understand how to use db:output() in the context of a REST function that does a bunch of stuff that updates and then wants to return a result.

I have my updating functions using db:output() to return XML elements that are log entries, which I have up to now then formatted as HTML for return by the Web app. However, there doesn't appear to be a way to get the stuff returned by db:output() before it gets returned to the ultimate caller. Have I missed something?

My REST-handling function is declared as %updating:

    declare %updating
      %rest:path("/repo/{$repo}/{$branch}/updateLinkManagementIndexes")

Which I understand to be a requirement if the function itself calls any updating functions.

I tried e.g.:

    let $result := f:myUpdatingFunction()
    return db:output(f:formatLogItems($result))

But that results in the "no updating functions" message on the variable assignment.

Is there a way to do what I want?

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] Organizing Large Numbers of Small Documetns?
I'm trying to plan for large scale. For a large company managing many DITA publications you could have 100s of 1000s of individual topics and 1000s of maps that organize those topics.

So if there are 100,000 topics and an average of 10 links to each topic (definitely an upper bound in normal practice) then there would be 1 million where-used records (one for each link to a given topic).

So if managing 1 million documents is not a problem then I should be fine.

Cheers,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/5/15, 10:43 AM, Christian Grün christian.gr...@gmail.com wrote:

Hi Eliot,

You shouldn't encounter any problems when storing all XML documents in the same database directory. It may only become an issue if you plan to export the files to disk.

I remember one use case in which we stored around 20 million documents in a single database. That was a while ago, but as we have generally improved access to single documents in more recent versions of BaseX, I'm optimistic you should be fine.

How many documents are you going to store altogether?

Christian

On Sun, Jul 5, 2015 at 3:57 PM, Eliot Kimber ekim...@contrext.com wrote:

In my link management application I'm creating a potentially large number of small documents that will serve as a where-used index over the content documents.

Are there any potential issues with how these documents are organized into directories within a database? The simplest thing would be to put them into one directory, but I could also organize them into one directory per document used (which would then result in one directory per document that has links to it, which in a DITA context would normally be one directory for each document, as every document save a few should have at least one link to it).

So if I had a repository with 100,000 topics, each with at least one link, is it a problem to have either 100,000 * x files in one directory or 100,000 directories under one parent directory?

I didn't see anything in the docs about limitations or practices for storing large numbers of documents.

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] question
I believe you need to use the "castable as" or "instance of" operators to determine the type, e.g.:

    let $typeName :=
      if ($somevar castable as xs:integer) then 'integer'
      else if ($somevar castable as xs:string) then 'string'
      else 'unknown type'

Or something close to that. There is also instance of, which might be

I also found this paper, which might be interesting:

http://www.balisage.net/Proceedings/vol8/html/Holstege01/BalisageVol8-Holstege01.html

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/29/15, 9:54 AM, Rob Stapper basex-talk-boun...@mailman.uni-konstanz.de on behalf of r.stap...@lijbrandt.nl wrote:

Hello,

I must be overlooking something, so excuses in advance, but: How can I retrieve the datatype of a variable?

TIA,
Rob Stapper
Re: [basex-talk] Getting More Diagnostic Info From DBA Query Runner?
I looked at the Javascript for this but my feeble Javascript skills were insufficient to adjust this in the small amount of time I had to work on it. I'll try to look at it again when I can.

Cheers,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/20/15, 10:53 AM, Christian Grün christian.gr...@gmail.com wrote:

> If you can point me at the relevant code I'll put aside my distaste for
> Javascript and take a look.

It should be here:

https://github.com/BaseXdb/basex/blob/master/basex-api/src/main/webapp/dba/static/js.js#L79-L85

Seems most of us share the distaste; otherwise, this code snippet would probably look more elegant.
Re: [basex-talk] question
I am happy to be corrected.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/29/15, 10:42 AM, Florent Georges fgeor...@gmail.com on behalf of li...@fgeorges.org wrote:

Hi,

That would base the type annotation on the lexical value. For instance, the string '0' would be labelled as integer (given you test for xs:integer before xs:string):

    let $var := '0'
    return
      if ( $var castable as xs:integer ) then 'integer'
      else if ( $var castable as xs:string ) then 'string'
      else 'unknown type'
    (: -> 'integer' :)

The operator "instance of" would solve that problem. But then, for that purpose, there is the typeswitch instruction:

    typeswitch ( '0' )
      case xs:integer return 'integer'
      case xs:string return 'string'
      default return 'unknown type'
    (: -> 'string' :)

Regards,

--
Florent Georges
http://fgeorges.org/
http://h2oconsulting.be/

On 29 July 2015 at 17:34, Eliot Kimber ekim...@contrext.com wrote:

I believe you need to use the "castable as" or "instance of" operators to determine the type, e.g.:

    let $typeName :=
      if ($somevar castable as xs:integer) then 'integer'
      else if ($somevar castable as xs:string) then 'string'
      else 'unknown type'

Or something close to that. There is also instance of, which might be

I also found this paper, which might be interesting:

http://www.balisage.net/Proceedings/vol8/html/Holstege01/BalisageVol8-Holstege01.html

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/29/15, 9:54 AM, Rob Stapper basex-talk-boun...@mailman.uni-konstanz.de on behalf of r.stap...@lijbrandt.nl wrote:

Hello,

I must be overlooking something, so excuses in advance, but: How can I retrieve the datatype of a variable?

TIA,
Rob Stapper
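The "instance of" variant mentioned in this thread can be sketched as below. The function name local:type-name is made up for this example, and only two types are tested; a real version would list whatever types matter to the caller.

```xquery
(: Hypothetical helper: reports the type of a single item by testing
   its actual type annotation rather than its lexical form. :)
declare function local:type-name($value as item()) as xs:string {
  if ($value instance of xs:integer) then 'integer'
  else if ($value instance of xs:string) then 'string'
  else 'unknown type'
};

(: '0' is a string, 0 is an integer, 1.5 is an xs:decimal. :)
(local:type-name('0'), local:type-name(0), local:type-name(1.5))
```

Unlike the "castable as" test, the string '0' is reported as 'string' here, and the decimal 1.5 falls through to 'unknown type' because it matches neither branch.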
[basex-talk] RESTXQ: Call a Page With A Parameter That Does not Persist in the URL
In the context of my RESTXQ Web service that uses web:redirect, when the process is done I want to return to the original page and on that page display a message indicating that the process succeeded or failed. My initial attempt was simply to redirect to the page and use a query parameter to convey the message. The REST handler then checks for the message and if present shows it in the generated page. The problem I'm running into is that the query parameter persists in the URL so if the user refreshes the page the message is still there, even though it's no longer relevant. There must be a way to do what I want but I'm not sure what it is. Based on my research I think I can do a form POST instead of a GET but that seems kind of heavy weight. Is there a simpler/better way to achieve this effect? I didn't see anything obvious in the RESTXQ docs. Thanks, Eliot Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Returning REST Response From Updating Function
I have successfully refactored my Web service code to use redirects in order to perform a sequence of updating functions, where the next function depends on updates made by the previous function.

The pattern is pretty simple. The first REST handler, which is the public API and must also be declared as %updating, sets up the parameters for the internal calls and then does:

    let $params := map { 'param1': 'somevalue' }
    return
      try {
        db:output(web:redirect($targetURI, $params))
      } catch * {
        db:output(web:redirect($errorPageURI, $params))
      }

Then each of the updating functions follows the same pattern, doing whatever updates it does and then returning either a redirect to the next stage or an appropriate error response.

I still need to set up logging to a document within the repo, but that will be no problem now that I have the general pattern in place.

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/6/15, 8:14 AM, Eliot Kimber basex-talk-boun...@mailman.uni-konstanz.de on behalf of ekim...@contrext.com wrote:

Christian,

Thanks for those pointers--lots of interesting stuff.

Looking at how those two apps are using db:output() and restxq:redirect, it looks like the cleanest solution would be to log processing details to a document in the repo and then set up a REST function that takes the log ID and presents it in whatever way is appropriate. Shouldn't be too hard to set up.

Cheers,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/5/15, 12:35 PM, Christian Grün christian.gr...@gmail.com wrote:

Hi Eliot,

Two years ago, two members of our team gave a little demo on RESTXQ [1,2]. Maybe this gives you an idea how db:output can be used in web applications.

Just recently, we added a function that allows you to access the current entries of the output cache [3]. Please note, however, that this is more like a helper function that was mainly integrated for XQUnit tests.

Best,
Christian

[1] http://files.basex.org/publications/xmlprague/2013.html
[2] http://files.basex.org/publications/xmlprague/2013.html
[3] http://docs.basex.org/wiki/Database_Module#db:output-cache

On Sun, Jul 5, 2015 at 6:42 PM, Eliot Kimber ekim...@contrext.com wrote:

I'm trying to understand how to use db:output() in the context of a REST function that does a bunch of stuff that updates and then wants to return a result.

I have my updating functions using db:output() to return XML elements that are log entries, which I have up to now then formatted as HTML for return by the Web app. However, there doesn't appear to be a way to get the stuff returned by db:output() before it gets returned to the ultimate caller. Have I missed something?

My REST-handling function is declared as %updating:

    declare %updating
      %rest:path("/repo/{$repo}/{$branch}/updateLinkManagementIndexes")

Which I understand to be a requirement if the function itself calls any updating functions.

I tried e.g.:

    let $result := f:myUpdatingFunction()
    return db:output(f:formatLogItems($result))

But that results in the "no updating functions" message on the variable assignment.

Is there a way to do what I want?

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
[basex-talk] Running BaseX In A Docker Container?
As part of my DITA For Small Teams project we're setting up a set of coordinated Docker containers to manage the various components (git server, Jenkins server, BaseX server). We're setting up our own custom containers so that they are appropriately pre-configured. In the case of BaseX that means configured with the DFST Web app and supporting modules, configuration details, etc. Just curious if anyone has experience putting BaseX in a container and if there's anything we should look out for or anything we can contribute back out of this effort. Cheers, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com
[basex-talk] Using BaseX With oXygenXML: Validate XQuery 3.1 Syntax
I've started using oXygenXML rather than the DBA Web app to run my test queries (mostly so I can handle large result sets in the output).

If I have a statement like:

    let $foo as map(*) := map {}

oXygen reports a static validation error. If I remove the "as" clause, then it's happy.

My question: is it possible to configure oXygenXML so it validates the 3.1 expressions BaseX supports? I'm not sure if this is an oXygenXML or a BaseX issue.

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] Merging databases?
If there's no better way, you should be able to simply copy the docs without first exporting them:

    for $dbname in $dbNames
    return
      let $docs := collection($dbname)
      for $doc in $docs
      return db:add($newDbName, $doc, local:makeNewDocUrl($doc))

Or something close to that.

Cheers,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/24/15, 2:30 PM, Amanda Galtman basex-talk-boun...@mailman.uni-konstanz.de on behalf of amanda.galt...@mathworks.com wrote:

Hi,

Is there a simple, fast way to merge databases? For example, suppose I have 200 databases, each with 5 XML documents, and the document paths are unique across all the databases. I want to produce a database containing all 1000 documents.

I looked in the Database Module wiki and nothing jumped out as being uniquely tailored for this task. I'm sure there is an ad hoc way to do it by looping over the documents in the databases, exporting them, and adding them to the new database. But if that takes the same amount of time as building the 200 databases in the first place, I should change other things about my design to avoid the need to merge databases.

Thanks,
Amanda
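Since the document paths in this scenario are said to be unique across all databases, the copy loop can reuse each document's existing path instead of computing a new one. The following is a rough sketch under stated assumptions: the database names 'db1', 'db2' and 'merged' are hypothetical, the target database is assumed to exist already, and nothing here says how its speed compares with rebuilding from the source files.

```xquery
(: Hypothetical names: replace with the real source and target databases.
   The target database must already exist before this query runs. :)
let $sources := ('db1', 'db2')
let $target  := 'merged'
for $source in $sources
for $doc in collection($source)
(: db:path() returns the document's path inside its own database;
   reusing it keeps the original layout in the merged database. :)
return db:add($target, $doc, db:path($doc))
```

All db:add calls are collected as pending updates and applied together when the query finishes, so the target database is only modified once the whole loop has been evaluated.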
[basex-talk] More Array Troubles: Syntax Subtlety That I'm Missing
I have this recursive function that takes an array as its second parameter:

    declare function lmm:applyScopesToNames(
        $keyNames as xs:string+,
        $keyScopes as array(*),
        $qualifiedNames as xs:string*) as xs:string* {
      let $result :=
        if (array:size($keyScopes) = 0)
        then $qualifiedNames
        else
          let $newNames :=
            for $name in $keyNames
            return
              for $scopeName in $keyScopes(1)
              return $scopeName || "." || $name
          return lmm:applyScopesToNames(
                   $newNames,
                   array:tail($keyScopes),
                   ($qualifiedNames, $newNames)
                 )
      return $result
    };

Testing with this input:

    let $keyName := "base"
    let $keyScopes := [("s3", "s4"), ("s1", "s2")]

This call:

    let $qualifiedNames := lmm:applyScopesToNames(($keyName), $keyScopes, ())

Results in the message "Item expected, sequence found: [(s3, s4), (s1, s2)]".

Unfortunately, the BaseX interactive query tool doesn't report the line that produces the message, but the only place I'm passing the full array is in the initial call to the function from my test query.

I've also verified that if I pass in [] as the keyScopes parameter then I get the expected result (that is, the value of the 3rd parameter is returned).

I've verified that $keyScopes is an array of two sequences and as far as I can tell I'm passing arrays where arrays are expected, so I'm not sure what my error is, but there must be one.

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
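When chasing a message like this one, it can help to check each array operation the function relies on by itself. The snippet below is only a diagnostic sketch using the test input from the post; it does not identify the cause of the error, and the comments describe what the XQuery 3.1 specification defines for these operations.

```xquery
let $keyScopes := [("s3", "s4"), ("s1", "s2")]
return (
  (: number of members in the array: 2 :)
  array:size($keyScopes),
  (: looking up member 1 yields the sequence ("s3", "s4") :)
  $keyScopes(1),
  (: array:tail drops the first member, leaving one member :)
  array:size(array:tail($keyScopes))
)
```

If any of these steps fails on its own in the query tool, that narrows the problem to the array handling rather than the recursion.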
Re: [basex-talk] Any Way to Create A Database And Commit To It From REST Service?
I took my option (1) and now have my commit hooks ensure that the database exists, which was easy enough to do (I just needed to add an additional CHECK command to my existing BaseX command sequence). It just means that the RESTXQ code itself can't ensure that the metadata database is up to date, at least not without some significant refactoring, which I probably need to do but don't have scope for at the moment.

The problem in my case with doing the create and add at once is that there is quite complex business logic involved in creating the documents, so they're not something that can be conveniently created, held in memory, and then committed at once (I could have implemented the processing that way but it's definitely not the natural way to do it). Basically I'm building a metadata cache to optimize queries that would be hard to do by brute force. The number of documents to be created and added could be quite large.

As a cache you'd like to be able to have it created or updated on demand--e.g., if I request an operation that requires the cache, the first thing you do is see if the cache is up to date and, if not, create or update it. But with XQuery's constraint on when you can do updates that's not possible, at least not in a naïve way--if you have a function that needs to return a result it cannot also do updating (meaning it can't call any updating functions). I understand the reasoning for that restriction but it is an annoying constraint.

In my case my problem is a little easier because the whole BaseX database is really a cache of what's in a separate git repo, so any time the git repo is updated (new commit or new branch creation or check out) I refresh the BaseX database and can regenerate my cache at that time, so it's not a big deal, just a mismatch between how I would do things in, e.g., a Java application and how they have to be done using RESTXQ.

I have looked at the DBA code but I was having a hard time understanding exactly what it was doing in the short time I had to work on it (although I did see what it was doing with db:output() and redirection).

Cheers,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/13/15, 2:59 AM, Dirk Kirsten basex-talk-boun...@mailman.uni-konstanz.de on behalf of d...@basex.org wrote:

Hello Eliot,

You can add the documents directly using the db:create() function, thus being able to create and initialize the database. If you have multiple documents to add it can be done like the following example:

    db:create('new-database',
      (<some-data/>, <more-data/>),
      ('somedata.xml', 'moredata.xml'))

Hope this helps,
Dirk

On 07/11/2015 07:38 PM, Eliot Kimber wrote:

In my link management application I'm trying to refactor my code to move my link management metadata out of the content database (the one that otherwise has the user's XML docs) and into a separate database.

The problem I'm running into is that because db:create() is put last on the pending updates list, you can't do

    return (
      db:create('new-database'),
      db:add('new-database', '<some-data/>'))

There doesn't seem to be a direct way through the RESTXQ-managed application to create the database since updating functions don't return anything (so I can't, for example, have the root page handler ensure that the database is there simply by calling an XQuery function to initialize it).

I think the only solutions are:

1. Have the database creation managed outside the REST application (e.g., in my git hooks, which are where the content database creation is done)

2. Use the redirect technique so that, for example, the root page is handled by an updating function that does the database initialization and then returns a redirect to the real root page handler.

Have I missed an option?

Thanks,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

--
Dirk Kirsten, BaseX GmbH, http://basexgmbh.de
|-- Firmensitz: Blarerstrasse 56, 78462 Konstanz
|-- Registergericht Freiburg, HRB: 708285, Geschäftsführer:
|   Dr. Christian Grün, Dr. Alexander Holupirek, Michael Seiferle
`-- Phone: 0049 7531 28 28 676, Fax: 0049 7531 20 05 22
[basex-talk] Getting More Diagnostic Info From DBA Query Runner?
I'm using the DBA web app to test and debug my Web app and underlying queries. However, when there are runtime errors it only reports the error, but not the code that produced it. Is there any way for the query runner to report what module/function/line the error resulted from? Or is that information in a log somewhere? Thanks, Eliot Eliot Kimber, Owner Contrext, LLC http://contrext.com
[basex-talk] BaseX Pegging Java Process: How To Diagnose?
Using BaseX 8.2.1 HTTP server, not actively performing any queries (but having performed some after restarting the server), BaseX is pegging one of my processors. What can I do to determine what BaseX is doing when it's in this state? Thanks, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Getting More Diagnostic Info From DBA Query Runner?
Using the latest DBA Web app from 8.2.1, when I get an error from a query run from the Query panel the only response I get on that page is "[FORG0001] Cannot cast to xs:double: ."

Using this query:

    declare %rest:path('invalid') function local:err() { 1 + <a/> };
    <result>{local:err()}</result>

I get the full trace result if I, for example, try to run one of my own RESTXQ functions and there is an error, but not in the DBA app.

I'm using the DBA app to test and debug my code (and it's very convenient for that).

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/20/15, 12:03 AM, Christian Grün christian.gr...@gmail.com wrote:

> However, when there are runtime errors it only reports the error, but
> not the code that produced it.

Sorry, I need more information. The following query...

    declare %rest:path('invalid') function local:err() { 1 + <a/> };

...results in the following runtime error:

    Stopped at (...path to webapp...), 2/6:
    [FORG0001] Cannot cast to xs:double: .
    Stack Trace:
    - (...path to webapp...), 1/40
Re: [basex-talk] Getting More Diagnostic Info From DBA Query Runner?
If you can point me at the relevant code I'll put aside my distaste for Javascript and take a look.

Cheers,
E
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/20/15, 10:32 AM, Christian Grün christian.gr...@gmail.com wrote:

Ok, I got it. The full error is returned to the client, but it is then chopped in the Javascript code. If you are interested, you could have a look at it?

On Mon, Jul 20, 2015 at 4:55 PM, Eliot Kimber ekim...@contrext.com wrote:

Using the latest DBA Web app from 8.2.1, when I get an error from a query run from the Query panel the only response I get on that page is "[FORG0001] Cannot cast to xs:double: ."

Using this query:

    declare %rest:path('invalid') function local:err() { 1 + <a/> };
    <result>{local:err()}</result>

I get the full trace result if I, for example, try to run one of my own RESTXQ functions and there is an error, but not in the DBA app.

I'm using the DBA app to test and debug my code (and it's very convenient for that).

Cheers,
E.
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 7/20/15, 12:03 AM, Christian Grün christian.gr...@gmail.com wrote:

> However, when there are runtime errors it only reports the error, but
> not the code that produced it.

Sorry, I need more information. The following query...

    declare %rest:path('invalid') function local:err() { 1 + <a/> };

...results in the following runtime error:

    Stopped at (...path to webapp...), 2/6:
    [FORG0001] Cannot cast to xs:double: .
    Stack Trace:
    - (...path to webapp...), 1/40
Re: [basex-talk] Finding document based on filename
How about:

    let $docs := collection('/mydir')/*
    for $doc in $docs
    return
      if (matches(document-uri(root($doc)), '^.+somestring$'))
      then $doc
      else ()

Cheers,
Eliot
--
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 8/31/15, 11:35 AM, "Martín Ferrari" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of ferrari_mar...@hotmail.com> wrote:

>Hi Mansi, I have a similar situation. I don't think there's a fast
>way to get documents by only knowing a part of their names. It seems you
>need to know the exact name. In my case, we might be able to group
>documents by a common id, so we might create subfolders inside the DB and
>store/get the contents of the subfolder directly, which is pretty fast.
>
>I've also tried indexing, but insertions got really slow (I assume
>maybe because indexing is not granular, it indexes all values) and we
>need performance.
>
>Oh, I've also tried using starts-with() instead of contains(), but
>it seems it does not pick up indexes.
>
>Martín.
>
>
>Date: Fri, 28 Aug 2015 16:52:37 -0400
>From: mansi.sh...@gmail.com
>To: basex-talk@mailman.uni-konstanz.de
>Subject: [basex-talk] Finding document based on filename
>
>Hello,
>I will have 100s of databases, with each database having 100 XML
>documents. I want to devise an algorithm where, given a part of an XML
>file name, I can find out which database(s) contain it, or null if the
>document is not currently present in any database. Based on that, I add
>the current document to the database. This is to always maintain the
>latest version of a document in the DB, removing the older version
>while adding the newer version.
>
>So far, the only way I could come up with is:
>
>for $db in all-databases:
>    open $db
>    $fileNames = list $db
>    for eachFileName in $fileNames:
>        if $eachFileName.contains(sub-xml filename):
>            add to ret-list-db
>
>return ret-list-db
>
>The above algorithm seems highly inefficient. Is there any indexing which
>can be done? Do you suggest that, for each document insert, I should
>maintain a separate XML document which lists each file inserted, etc.?
>
>Once I get hold of the above list of DBs, I would eventually be deleting
>that file and inserting the latest version of that file (which would have
>the same sub-xml file name). So constant updating of this external
>document also seems painful (maybe?).
>
>Also, would it be faster using XQuery script files through Java code, or
>using the Java API for such operations?
>
>How do you all deal with such operations?
>
>- Mansi
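The pseudocode in the quoted question maps fairly directly onto BaseX's db:list() function, which lists database names when called without arguments and resource paths when given a database name. The sketch below is only an illustration of that loop: the fragment value is hypothetical, and it still scans every path, so it does not address the performance concern raised in the thread.

```xquery
(: Hypothetical file name fragment to search for. :)
let $fragment := 'somestring'
for $db in db:list()          (: names of all databases :)
for $path in db:list($db)     (: paths of all resources in one database :)
where contains($path, $fragment)
(: report the database together with the matching resource path :)
return $db || '/' || $path
```

An empty result corresponds to the "null" case in the question, meaning that no database currently holds a document whose path contains the fragment.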
Re: [basex-talk] BaseX and Git versioning of XML at the same time
I'm doing the reverse: using git hooks to reflect XML files managed in git in BaseX, where BaseX serves as a read-only index over the docs.

However, it would be possible to use git's remote API (or the API of a git server like GitLab or GitHub) to reflect changes made in BaseX back into your git repo. You wouldn't do it directly from the file system.

My work is part of the DITA for Small Teams project, http://dita-for-small-teams.org. I'll be speaking about it remotely on Friday at the BaseX user meetup in advance of the XML Prague conference.

Cheers,

Eliot

Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

From: <basex-talk-boun...@mailman.uni-konstanz.de> on behalf of Enea Parimbelli <enea.parimbe...@gmail.com>
Date: Wednesday, February 10, 2016 at 6:33 AM
To: "basex-talk@mailman.uni-konstanz.de" <basex-talk@mailman.uni-konstanz.de>
Subject: [basex-talk] BaseX and Git versioning of XML at the same time

Dear all,

I'm Enea from Italy (working at the University of Pavia). This is my first email to this list and I hope you'll forgive my dumb question.

For one of my projects I was planning to use BaseX to run XQueries that manipulate a bunch of XML files I have on my filesystem (i.e. to take advantage of what XQuery can do vs. manual editing of the XML files). At the same time I would like to have a version control system (Git in my case) control my files so that I can diff, branch, revert changes, etc. However, the way BaseX internally stores files is preventing me from doing this easily.

Here is an example of what I have in mind:

0. I have a bunch of XML files in a folder (versioned with a git repo).
1. I create a BaseX database from these.
2. I run a bunch of updates/edits on the files using the XQuery processor from BaseX.
3. I move back to git to diff what I've accomplished, commit, etc.
4. (Possibly more advanced, not sure if I really need this.) I want to be able to check out a previous version of the files and have BaseX do the same (i.e. if I run a query now I want it to run on the version of the XML files that I have currently checked out in my filesystem).

Here are the (dumb) questions:

1. Is there a way for BaseX (maybe some config settings when I create the database?) to keep working on the plain XML files, so that every update query would actually change the original XML files (which I can view with a plain text editor even "outside" the BaseX environment)?
2. Is exporting the database after every update the only way of immediately seeing the changes I made to the XML files?
3. (Assuming 1 is doable.) Would it be possible to have BaseX recognize changes made to the XML files using an external editor (i.e. not through XQueries run in BaseX)?
4. From your experience, is BaseX the proper tool for my purposes? It feels to me that I merely want to use its XQuery processor capabilities (and I don't need the full-fledged database) while keeping the files plainly stored in my filesystem... Any suggestion on alternative options if BaseX doesn't sound like the right one?

Thanks in advance for the help, and congratulations on the great work (I used BaseX in the past in a more orthodox way and was impressed by the numerous nice features... including the nice GUI and the hassle-free integration with Tomcat!)

Best regards,

Enea Parimbelli

--
Dr. Enea Parimbelli
Post-doctoral research fellow
Laboratory for Biomedical Informatics "Mario Stefanelli"
Department of Electrical, Computer and Biomedical Engineering
University of Pavia, Italy
e-mail: enea.parimbe...@gmail.com
phone: +39-0382-985057, +39-0382-985981
http://www.labmedinfo.org/people/parimbelli
Re: [basex-talk] Unrecognized Options Running Basex in Docker Container
Looks like debug gives no extra info:

basex@f2c1e3d6f9db:~$ basexhttp -d
/home/basex/.basex: Unknown option 'CATFILE'.
/home/basex/.basex: Unknown option 'DTD'.
/home/basex/.basex: Unknown option 'SKIPCORRUPT'.
/home/basex/.basex: Unknown option 'CHOP'.
/home/basex/.basex: writing new configuration file.
DEBUG: true
[main] INFO org.eclipse.jetty.server.Server - jetty-8.1.17.v20150415
[main] INFO org.eclipse.jetty.webapp.StandardDescriptorProcessor - NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet
WEBPATH: /opt/basex/webapp
DEBUG: true
BaseX 8.3 [Server]
Server was started (port: 1984).
[main] INFO org.eclipse.jetty.server.AbstractConnector - Started SelectChannelConnector@0.0.0.0:8984
HTTP Server was started (port: 8984).

At one point I did have a problem where my file system permissions weren't set correctly and I got a "cannot write .basex file" message, and debug showed me a Java traceback, but I resolved that issue.

Cheers,

E.

----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

From: Christian Grün <christian.gr...@gmail.com>
Date: Friday, January 29, 2016 at 2:20 AM
To: Eliot Kimber <ekim...@contrext.com>
Cc: "basex-talk@mailman.uni-konstanz.de" <basex-talk@mailman.uni-konstanz.de>
Subject: Re: [basex-talk] Unrecognized Options Running Basex in Docker Container

Difficult to tell what may go wrong. What is the output if you start basexhttp in debugging mode (with -d)?

On 29.01.2016 at 1:10 AM, "Eliot Kimber" <ekim...@contrext.com> wrote:

> I'm seeing a difference in how the .basex file is processed between
> running a server under OS X directly and running it in a Docker container.
> In particular, my local settings are being rejected as unrecognized.
>
> Here are the startup messages from within the basex container:
>
> basex@611404b26b04:~$ basexhttp
> /home/basex/.basex: Unknown option 'CATFILE'.
> /home/basex/.basex: Unknown option 'DTD'.
> /home/basex/.basex: Unknown option 'SKIPCORRUPT'.
> /home/basex/.basex: Unknown option 'CHOP'.
> /home/basex/.basex: writing new configuration file.
> [main] INFO org.eclipse.jetty.server.Server - jetty-8.1.17.v20150415
> [main] INFO org.eclipse.jetty.webapp.StandardDescriptorProcessor - NO JSP
> Support for /, did not find org.apache.jasper.servlet.JspServlet
> BaseX 8.3 [Server]
> Server was started (port: 1984).
> [main] INFO org.eclipse.jetty.server.AbstractConnector - Started
> SelectChannelConnector@0.0.0.0:8984
> HTTP Server was started (port: 8984).
>
> I verified that with the same .basex file running 8.3 under OS X the
> options are recognized and set as expected.
>
> What would cause this difference in behavior?
>
> Here is the .basex file in the /home/basex directory when the server
> starts:
>
> USER = admin
> PASSWORD = admin
> DEBUG = false
> DBPATH = /home/basex/basex/data
> REPOPATH = /home/basex/basex/repo
> LANG = English
> LANGKEYS = false
> GLOBALLOCK = false
>
> # Client/Server Architecture
> HOST = localhost
> PORT = 1984
> SERVERPORT = 1984
> SERVERHOST =
> PROXYHOST =
> PROXYPORT = 0
> NONPROXYHOSTS =
> IGNORECERT = false
> TIMEOUT = 30
> KEEPALIVE = 600
> PARALLEL = 8
> LOG = true
> LOGMSGMAXLEN = 1000
>
> # HTTP Services
> WEBPATH = /home/basex/basex/webapp
> RESTPATH =
> RESTXQPATH =
> CACHERESTXQ = false
> HTTPLOCAL = false
> STOPPORT = 8985
> AUTHMETHOD = Basic
>
> # Local options
> CATFILE = /opt/dita-ot/DITA-OT/catalog-dita.xml
> DTD = true
> SKIPCORRUPT = true
> CHOP = false
> -- (this is the end of the file) --
>
> After the server starts, the offending options are omitted from the
> rewritten .basex file.
>
> The only difference I can think of is the Java version. The container uses
> OpenJDK while I have Oracle Java running in OS X:
>
> basex@611404b26b04:~$ java -version
> openjdk version "1.8.0_66-internal"
> OpenJDK Runtime Environment (build 1.8.0_66-internal-b17)
> OpenJDK 64-Bit Server VM (build 25.66-b17, mixed mode)
> basex@611404b26b04:~$
>
> But otherwise it's exactly the same code running in both environments and
> the same config file.
>
> Thanks,
>
> Eliot
>
> Eliot Kimber, Owner
> Contrext, LLC
> http://contrext.com
Re: [basex-talk] Unrecognized Options Running Basex in Docker Container
Another bit of info. If I interact with the server from within the container, e.g., connect to the BaseX server using basexclient, I can set and get options, but they are not reflected in the DBA app:

> open test-02
Database 'test-02' was not found.

[I then create test-02 using the DBA Web app to prove I'm talking to the same server]

> open test-02
Database 'test-02' was opened in 0.68 ms.
> set chop false
CHOP: false
> get chop
CHOP: false

But if I refresh the page for test-02 in the DBA app, it still shows a check for chop.

So something seems to be up with local options.

I verified that I can add data to the database itself.

Thanks,

E.

Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] Unrecognized Options Running Basex in Docker Container
It appears to have been an issue with the configuration file itself (probably a Windows vs. Linux line-endings problem). When I took Michael Seiferle's advice and appended my settings to the base catalog, rather than replacing it entirely, it worked as expected.

I've replaced my own BaseX Docker image with the version Michael is maintaining (basex/basexhttp). One less thing for me to worry about :-)

Cheers,

E.

Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] Status of BaseX Docker Container?
OK, I'm going to proceed with making a generic BaseX Docker container in the DFST Docker project on GitHub. I'm using Andreas' as a starting point, but it's pretty trivial.

I'm working/testing on OS X, Windows, and CentOS, so I should be able to get something going.

Cheers,

E.

Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 1/26/16, 2:02 AM, "Christian Grün" <christian.gr...@gmail.com> wrote:

>Hi Eliot,
>
>Personally, I had a hard time making Docker work on Windows machines,
>but we have various Docker aficionados in our team and around, so I
>hope they'll give you some feedback soon.
>
>Cheers,
>Christian
>
>
>On Sun, Jan 24, 2016 at 8:22 PM, Eliot Kimber <ekim...@contrext.com>
>wrote:
>> I'm working toward using BaseX in a Docker container as part of the DITA
>> for Small Teams project (we're trying to set up a system of Docker
>> containers with all the DFST parts integrated out of the box).
>>
>> I notice that Andreas Jung has created a simple container here:
>>
>> https://github.com/zopyx/docker-basex
>>
>> And pushed it to the Docker Hub in the basex namespace.
>>
>> I'm wondering what the relationship of this is, if any, to any official
>> BaseX Docker support?
>>
>> Thanks,
>>
>> Eliot
>>
>> Eliot Kimber, Owner
>> Contrext, LLC
>> http://contrext.com
Re: [basex-talk] Status of BaseX Docker Container?
For what it's worth, I've pushed a generic BaseX HTTP container to Docker Hub as "dfst/basex". It exposes the default ports for the base and HTTP servers (1984 and 8984).

From a Linux system you connect by using the IP address of the Docker network, e.g.:

http://172.17.0.2:8984/

Under Windows:

The Dockerfile is in GitHub here:

https://github.com/dita-for-small-teams/dfst-docker

My next task will be to create a container based on this one that includes specific configuration and the DFST Web application modules. Should be easy once I get to it.

Cheers,

E.

Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] Status of BaseX Docker Container?
Forgot to add the Windows part:

Under Windows you have to explicitly publish the ports when running the container:

docker run --name=basex -p 8984:8984 -p 1984:1984 dfst/basex

Then you can use the IP address of the docker-engine VM to connect to the server:

http://192.168.99.100:8984/

You can get the IP address using the "docker-machine ip" command:

c:\Program Files\Docker Toolbox>docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER   ERRORS
default   -        virtualbox   Running   tcp://192.168.99.100:2376           v1.9.1

c:\Program Files\Docker Toolbox>docker-machine ip default
192.168.99.100

The docker-machine command replaces the older boot2docker command.

So now I have the BaseX server running and accessible in Docker containers under Windows and CentOS.

OS X is still a problem: there's some issue with the Docker daemon and getting the Docker client to connect to it. I haven't been able to find a good solution in my searching. Very annoying.

Cheers,

E.

----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
[basex-talk] Status of BaseX Docker Container?
I'm working toward using BaseX in a Docker container as part of the DITA for Small Teams project (we're trying to set up a system of Docker containers with all the DFST parts integrated out of the box).

I notice that Andreas Jung has created a simple container here:

https://github.com/zopyx/docker-basex

And pushed it to the Docker Hub in the basex namespace.

I'm wondering what the relationship of this is, if any, to any official BaseX Docker support?

Thanks,

Eliot

Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
[basex-talk] Where Is Documentation On Module Lookup?
I'm trying to find the documentation on how BaseX looks up modules and I can't find it in the current documentation. I know I found it in the past. Where should I be looking? Thanks, Eliot Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Where Is Documentation On Module Lookup?
Ah, thanks. For some reason the title "Repository" did not suggest the information I was looking for. But it's all coming back to me now. I was looking for "how to implement custom XQuery modules". Cheers, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com On 2/17/16, 10:23 AM, "Christian Grün" <christian.gr...@gmail.com> wrote: >Hi Eliot, > >You should find part of the information here [1]. Feel free to ask if >you need some more specific information. > >Christian > >[1] http://docs.basex.org/wiki/Repository > > >On Wed, Feb 17, 2016 at 5:15 PM, Eliot Kimber <ekim...@contrext.com> >wrote: >> I'm trying to find the documentation on how BaseX looks up modules and I >> can't find it in the current documentation. I know I found it in the >>past. >> >> Where should I be looking? >> >> Thanks, >> >> Eliot >> >> >> Eliot Kimber, Owner >> Contrext, LLC >> http://contrext.com >> >> >> >
[basex-talk] Sending Bytes, Not Strings, To BaseX Using the Ruby Client
I'm implementing server-side git hooks for use in GitLab under Docker, where Java is not available (at least as far as I can see). The hooks load or delete files from databases in BaseX.

I'm trying to implement the hooks in Ruby (which is much more pleasant than bash scripting in any case) and I'm using the BaseXClient.rb from https://github.com/BaseXdb/basex/tree/master/basex-api/src/main/ruby

I need to create or replace files by sending the bytes--I'd rather not read the input file into a Ruby string and send that, since I don't trust Ruby to not hose up the data (even when it's UTF-8 I still don't trust it, but I only started using Ruby yesterday so maybe my mistrust is misplaced?).

Using AddExample.rb as a guide, I'm doing this:

(Earlier code to open or create the database, which works.)

file = File.new("../../" + path, "rb")
bytes = file.read
file.close
puts "file=/#{bytes}/"
@basex.add(path, "#{bytes}")

I also tried:

@basex.add(path, bytes)

And I get this result (I added some debugging messages to sendCmd()):

ensureDatabase(): Checking database "_dfst^metadata^temp^master"...
BaseXResult: Database '_dfst^metadata^temp^master' was opened in 1.53 ms.
Added or modified file: "test-newname.xml"
file=/This is a test 20
/

*** sendCmd():
cmd=
arg=test-newname.xml
input=This is a test 20
BaseXClient.rb:110:in `sendCmd': "test-newname.xml.xml" (Line 1): Premature end of file. (RuntimeError)
        from commit-hooks/git/server-side/BaseXClient.rb:64:in `add'
        from commit-hooks/git/server-side/post-receive:80:in `block in update'
        from commit-hooks/git/server-side/post-receive:74:in `each'
        from commit-hooks/git/server-side/post-receive:74:in `update'
        from commit-hooks/git/server-side/post-receive:111:in `block in <main>'
        from commit-hooks/git/server-side/post-receive:103:in `each'
        from commit-hooks/git/server-side/post-receive:103:in `<main>'
Eliots-MBP:hooks ekimber$

A couple of things here:

Where is the extra ".xml" in the target filename coming from?

What is causing the premature end of file? It feels like it's trying to interpret the second argument as a filename rather than as the data to be loaded.

If I use basex.execute("add to #{path} #{bytes}") it works, but of course I get duplicate files if I run the command twice.

If I try:

@basex.execute("replace #{path} #{bytes}")

Then I get the same failure.

So something is not right.

My Docker container is running 8.4.1 beta.

What am I missing?

Thanks,

Eliot

Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
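On the question of whether Ruby can be trusted with the bytes: the following is a minimal sketch, not taken from BaseXClient.rb, of what a binary read does on the Ruby side. The helper names read_bytes and valid_utf8? are made up for illustration, and nothing here says how the BaseX client itself treats the string once it has been handed over.

```ruby
require 'tempfile'

# Read a file as raw bytes. File.binread opens the file in binary mode,
# so Ruby applies no newline or encoding conversion to the data.
# (read_bytes is a hypothetical helper name.)
def read_bytes(path)
  File.binread(path)
end

# Report whether the raw bytes form valid UTF-8 without changing them.
# force_encoding only relabels a copy of the string; it does not transcode.
# (valid_utf8? is a hypothetical helper name.)
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Demo with a throwaway file that contains a multi-byte character.
sample = "<doc>This is a test \u303A</doc>\n"
Tempfile.create('sample') do |f|
  f.binmode
  f.write(sample)
  f.flush
  bytes = read_bytes(f.path)
  puts bytes.encoding == Encoding::BINARY  # the string is tagged as binary
  puts bytes == sample.b                   # bytes are identical to what was written
  puts valid_utf8?(bytes)                  # and they happen to be valid UTF-8
end
```

A string read this way is just a tagged byte sequence; data should only change if something later calls encode or writes it through a text-mode stream.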
Re: [basex-talk] Sending Bytes, Not Strings, To BaseX Using the Ruby Client
I turned my UTF-8 file into a UTF-16 file and tried to commit it to BaseX via the Ruby client, but it did not work:

BaseXClient.rb:50:in `execute': Resource "/opt/basex/?" not found. (RuntimeError)

Where "?" is some kind of "unrecognized character" indicator.

Cheers,

E.

Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
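One possible workaround, sketched here under the assumption that the server side wants UTF-8, is to look for a byte order mark in the hook and transcode before sending. The function names detect_bom_encoding and to_utf8 are invented for this sketch, only BOMs are checked (a UTF-16 file without a BOM is not recognized), and UTF-32 is ignored.

```ruby
# Guess the encoding of raw XML bytes from a leading byte order mark.
# Returns nil when no BOM is found. (Hypothetical helper name.)
def detect_bom_encoding(bytes)
  head = bytes.byteslice(0, 3).to_s.b
  if head.start_with?("\xEF\xBB\xBF".b)
    Encoding::UTF_8
  elsif head.start_with?("\xFF\xFE".b)
    Encoding::UTF_16LE
  elsif head.start_with?("\xFE\xFF".b)
    Encoding::UTF_16BE
  end
end

# Return the document as a UTF-8 string with any BOM removed.
# Input without a BOM is assumed to be UTF-8 already and is only relabeled.
# (Hypothetical helper name.)
def to_utf8(bytes)
  enc = detect_bom_encoding(bytes)
  return bytes.dup.force_encoding(Encoding::UTF_8) if enc.nil?
  bom_len = enc == Encoding::UTF_8 ? 3 : 2
  body = bytes.byteslice(bom_len..-1).force_encoding(enc)
  body.encode(Encoding::UTF_8)
end

# Demo: a small UTF-16LE document with a BOM and one multi-byte character.
utf16 = "\uFEFF<doc>\u303A</doc>".encode(Encoding::UTF_16LE).b
utf8 = to_utf8(utf16)
puts utf8.encoding
puts utf8.valid_encoding?
```

Note that an XML declaration such as encoding="UTF-16" inside the document is left untouched by this sketch, so it may need to be rewritten or removed before the result is parsed.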
Re: [basex-talk] Sending Bytes, Not Strings, To BaseX Using the Ruby Client
This test document has a non-ASCII character '〺' (\u303A), which I added to test handling of multi-byte characters.

Ruby and the BaseX client seem to be handling the UTF-8 correctly, but UTF-16 didn't work. I'm guessing it's Ruby's fault because it's treating the bytes as a string, and of course that's not going to work in a naive way.

Cheers,

E.

Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Re: [basex-talk] Sending Bytes, Not Strings, To BaseX Using the Ruby Client
Hmm. Is there a way to send other encodings to the server via the remote API? I'm on my way to Japan for a workshop where we'll be using my system and Japanese-language documents are more efficiently stored in UTF-16 so my expectation is that users will either already have documents in that encoding or will create new ones. Of course, for the workshop we can limit ourselves to UTF-8 but I'm trying to make the system as foolproof as possible. I think the issue with my script was that I was putting quotes around the XML strings, which causes the server to treat it as a file path rather than as XML to load. Once I fixed that then I was able to delete and add files from my Ruby git hooks. I'll have to get a better understanding of how Ruby handles arbitrary byte sequences (this is where there's a little too much magic for my taste) but I would expect that if I provide the remote API with a byte sequence that starts with 0xFFFE, 0xFEFF, 0x003C003F, or 0x3C003F00 that it would treat it as UTF-16. Cheers, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com On 2/18/16, 4:58 PM, "Christian Grün" <christian.gr...@gmail.com> wrote: >Hi Eliot, > >For most client bindings, files must indeed be sent in UTF-8, so I >guess it’s also the case for the Ruby binding. If the sent bytes are >correct UTF-8, everything should work be fine. > >Christian > > >On Thu, Feb 18, 2016 at 6:08 PM, Eliot Kimber <ekim...@contrext.com> >wrote: >> This test document as a non-ascii character '〺' (\u303A), which I added >>to >> test handling of multi-byte characters. >> >> Ruby and the BaseX client seem to be handling the UTF-8 correctly but >> UTF-16 didn't. I'm guessing it's Ruby's fault because it's treating the >> bytes as a string and of course that's not going to work in a naive way. >> >> Cheers, >> >> E. 
>> >> Eliot Kimber, Owner >> Contrext, LLC >> http://contrext.com >> >> >> >> >> On 2/18/16, 11:04 AM, "Eliot Kimber" >> <basex-talk-boun...@mailman.uni-konstanz.de on behalf of >> ekim...@contrext.com> wrote: >> >>>I turned my UTF-8 file into a UTF-16 file and trying to commit it to >>>BaseX >>>via the Ruby client it did not work: >>> >>>BaseXClient.rb:50:in `execute': Resource "/opt/basex/?" not found. >>>(RuntimeError) >>> >>>Where "?" is some kind of "unrecognized character" indicator >>> >>>Cheers, >>> >>>E. >>> >>> >>> >>>Eliot Kimber, Owner >>>Contrext, LLC >>>http://contrext.com >>> >>> >>> >>> >>>On 2/18/16, 10:26 AM, "Eliot Kimber" >>><basex-talk-boun...@mailman.uni-konstanz.de on behalf of >>>ekim...@contrext.com> wrote: >>> >>>>I'm implementing server-side git hooks for use in GitLab under Docker >>>>where Java is not available (at least that I can see). The hooks load >>>>or >>>>delete files from databases in BaseX. >>>> >>>>I'm trying to implement the hooks in Ruby (which is much more pleasant >>>>than bash scripting in any case) and I'm using the BaseXClient.rb from >>>>https://github.com/BaseXdb/basex/tree/master/basex-api/src/main/ruby >>>> >>>>I need to create or replace files by sending the bytes--I'd rather not >>>>read the input file into a Ruby string and send that since I don't >>>>trust >>>>Ruby to not hose up the data (even when it's UTF-8 I still don't trust >>>>it, >>>>but I only started using Ruby yesterday so maybe my mistrust is >>>>misplaced?). >>>> >>>>Using the AddExample.rb as guide, I'm doing this: >>>> >>>>(Earlier code to open or create database, which works). >>>> >>>>file = File.new("../../" + path, "rb") >>>>bytes = file.read >>>>file.close >>>>puts "file=/#{bytes}/" >>>>@basex.add(path, "#{bytes}") >>>> >>>>I also tried: >>>> >>>>@basex.add(path, bytes) >>>> >>>> >>>> >>>>And I get this result (I added some debugging messages to sendCmd()): >>>> >>>>ensureDatabase(): Checking database "_dfst^metadata^temp^master"... 
>>>>BaseXResult: Database '_dfst^metadata^temp^master' was opened in 1.53 >>>>ms. >>>>Added or modified file: "test-newna
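One way to handle the UTF-16 question raised in this message is to transcode on the client side, so the server only ever receives UTF-8. The following is a minimal Ruby sketch under stated assumptions, not something taken from BaseXClient.rb: the helper name xml_bytes_to_utf8 and the two constants are hypothetical. It recognizes the four byte signatures listed above (the two byte-order marks, and "<?" encoded as UTF-16 without a mark) and also rewrites an encoding="UTF-16" declaration so that it matches the new bytes.

```ruby
# Hypothetical helper, not part of BaseXClient.rb.
# Signature bytes, the encoding they imply, and how many bytes to skip (the BOM).
UTF16_SIGNATURES = [
  ["\xFF\xFE".b,   Encoding::UTF_16LE, 2], # little-endian byte-order mark
  ["\xFE\xFF".b,   Encoding::UTF_16BE, 2], # big-endian byte-order mark
  ["\x00<\x00?".b, Encoding::UTF_16BE, 0], # "<?" in UTF-16BE, no BOM
  ["<\x00?\x00".b, Encoding::UTF_16LE, 0]  # "<?" in UTF-16LE, no BOM
].freeze

# Matches the encoding pseudo-attribute of a leading XML declaration.
ENCODING_DECL = /\A(<\?xml[^>]*?encoding\s*=\s*)(["'])[^"']*\2/

# Returns the document as a UTF-8 string. Input that carries none of the
# UTF-16 signatures is assumed to be UTF-8 already and is only relabelled.
def xml_bytes_to_utf8(bytes)
  raw = bytes.b # binary copy; the caller's string is left untouched
  UTF16_SIGNATURES.each do |signature, encoding, skip|
    next unless raw.start_with?(signature)

    text = raw.byteslice(skip..-1).force_encoding(encoding).encode(Encoding::UTF_8)
    # Keep the XML declaration consistent with the transcoded bytes.
    return text.sub(ENCODING_DECL) { "#{$1}#{$2}UTF-8#{$2}" }
  end
  raw.force_encoding(Encoding::UTF_8)
end
```

With the add call shown earlier in the thread, usage would presumably look like @basex.add(path, xml_bytes_to_utf8(File.binread(file))). Whether the server could be made to accept UTF-16 directly is a separate question that this sketch does not answer.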
Re: [basex-talk] Sending Bytes, Not Strings, To BaseX Using the Ruby Client
I suspect that using the REST API directly is the answer, although if I understand the Ruby client code, it's using direct socket connections, which would be more efficient. Not a critical issue at the moment but something I need to understand more fully before too long. Cheers, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com On 2/19/16, 9:31 AM, "Christian Grün" <christian.gr...@gmail.com> wrote: >> Hmm. Is there a way to send other encodings to the server via the remote >> API? > >A difficult one for me to answer, because I have never worked with >Ruby before… Maybe there are some other users on the list who can >reply on this? > >> I'm on my way to Japan for a workshop where we'll be using my system and >> Japanese-language documents are more efficiently stored in UTF-16 so my >> expectation is that users will either already have documents in that >> encoding or will create new ones. Of course, for the workshop we can >>limit >> ourselves to UTF-8 but I'm trying to make the system as foolproof as >> possible. > >Sounds interesting, and absolutely reasonable. Maybe our HTTP services >(e.g. the default REST API) could be an alternative? > >Christian > > >> I think the issue with my script was that I was putting quotes around >>the >> XML strings, which causes the server to treat it as a file path rather >> than as XML to load. Once I fixed that then I was able to delete and add >> files from my Ruby git hooks. >> >> I'll have to get a better understanding of how Ruby handles arbitrary >>byte >> sequences (this is where there's a little too much magic for my taste) >>but >> I would expect that if I provide the remote API with a byte sequence >>that >> starts with 0xFFFE, 0xFEFF, 0x003C003F, or 0x3C003F00 that it would >>treat >> it as UTF-16. >> >> Cheers, >> >> E. 
>> >> Eliot Kimber, Owner >> Contrext, LLC >> http://contrext.com >> >> >> >> >> On 2/18/16, 4:58 PM, "Christian Grün" <christian.gr...@gmail.com> wrote: >> >>>Hi Eliot, >>> >>>For most client bindings, files must indeed be sent in UTF-8, so I >>>guess it’s also the case for the Ruby binding. If the sent bytes are >>>correct UTF-8, everything should work be fine. >>> >>>Christian >>> >>> >>>On Thu, Feb 18, 2016 at 6:08 PM, Eliot Kimber <ekim...@contrext.com> >>>wrote: >>>> This test document as a non-ascii character '〺' (\u303A), which I >>>>added >>>>to >>>> test handling of multi-byte characters. >>>> >>>> Ruby and the BaseX client seem to be handling the UTF-8 correctly but >>>> UTF-16 didn't. I'm guessing it's Ruby's fault because it's treating >>>>the >>>> bytes as a string and of course that's not going to work in a naive >>>>way. >>>> >>>> Cheers, >>>> >>>> E. >>>> >>>> Eliot Kimber, Owner >>>> Contrext, LLC >>>> http://contrext.com >>>> >>>> >>>> >>>> >>>> On 2/18/16, 11:04 AM, "Eliot Kimber" >>>> <basex-talk-boun...@mailman.uni-konstanz.de on behalf of >>>> ekim...@contrext.com> wrote: >>>> >>>>>I turned my UTF-8 file into a UTF-16 file and trying to commit it to >>>>>BaseX >>>>>via the Ruby client it did not work: >>>>> >>>>>BaseXClient.rb:50:in `execute': Resource "/opt/basex/?" not found. >>>>>(RuntimeError) >>>>> >>>>>Where "?" is some kind of "unrecognized character" indicator >>>>> >>>>>Cheers, >>>>> >>>>>E. >>>>> >>>>> >>>>> >>>>>Eliot Kimber, Owner >>>>>Contrext, LLC >>>>>http://contrext.com >>>>> >>>>> >>>>> >>>>> >>>>>On 2/18/16, 10:26 AM, "Eliot Kimber" >>>>><basex-talk-boun...@mailman.uni-konstanz.de on behalf of >>>>>ekim...@contrext.com> wrote: >>>>> >>>>>>I'm implementing server-side git hooks for use in GitLab under Docker >>>>>>where Java is not available (at least that I can see). The hooks load >>>>>>or >>>>>>delete files from databases in BaseX. >>>>
Re: [basex-talk] Automated Docker Build: basex/basexhttp
Cool--I'll try it as soon as I can. I have my own BaseX-based container with a custom Web app working now, but based on my own small mod of an earlier Dockerfile. Looks like this version addresses the issue I had (creating volumes in the base container made it impossible to add repos or webapps in using Dockerfiles). Cheers, Eliot Eliot Kimber, Owner Contrext, LLC http://contrext.com On 2/19/16, 3:54 PM, "Jens Erat" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of jens.e...@uni-konstanz.de> wrote: >Dear BaseX community, > >over the last weeks, interest in Docker utilization with BaseX hevily >increased, and several images have been proposed. I've already made some >experience running a BaseX pet project in BaseX for about a year now, >and together with the BaseX core team created an image based on those >experiences and current best practices. > > >basex/basexhttp Docker Image > > >Finally there's the official basex/basexhttp Docker image readily >available on the Docker Hub. It's an automated build directly from >source, with nightly builds and tagged builds for future BaseX releases >(starting with 8.4.1/8.5, whatever will come first). It is derived from >the Maven base image (and thus Debian), and is automatically updated >when the base images experience updates (like Java security fixes). > >- https://hub.docker.com/r/basex/basexhttp/ >- https://github.com/BaseXdb/basex/blob/master/Dockerfile > >While the image is named basexhttp and always includes the HTTP server, >it can also be used for running the plain basexserver or even basexclient. > > >DBA Application Container >- > >As an example for deriving your own application images and also for >interfacing BaseX for administrative tasks and ad-hoc queries, the DBA >is also made available as a container. 
> >- https://hub.docker.com/r/basex/dba/ >- >https://github.com/BaseXdb/basex/tree/master/basex-api/src/main/webapp/dba > > >Documentation >- > >As you're used to, documentation is available in the BaseX wiki. > >- http://docs.basex.org/wiki/Docker > >- - - > >If you've got any feedback, questions or proposals, feel free to get in >touch with me or the core BaseX team on the usual ways. > >Kind regards from Lake Constance, Germany, >Jens > >-- >Jens Erat > > [phone]: tel:+49-151-56961126 > [mail]: mailto:em...@jenserat.de >[jabber]: xmpp:jab...@jenserat.de > [web]: http://www.jenserat.de > > OpenPGP: 0D69 E11F 12BD BA07 7B37 26AB 4E1F 799A A4FF 2279 >
Re: [basex-talk] Automated Docker Build: basex/basexhttp
The problem was that I wanted to add files to the /webapp directory, but once the Dockerfile defines it as volume you can't modify it as part of the container definition, you can only mount to it from outside. My general approach is to have two images, one without the volumes, for use in defining derived containers, and a working image that defines the volumes so they are ready for mounting from other containers or from the host. I suppose another way to handle it is to not define the volumes at all and then do it in Docker-compose scripts (I'm using docker-compose to manage all my related containers and provide a convenient way to manage everything through one command). Cheers, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com On 2/19/16, 4:27 PM, "Jens Erat" <jens.e...@uni-konstanz.de> wrote: >Hi Eliot, > >> Looks like this version addresses the issue I had >> (creating volumes in the base container made it impossible to add repos >>or >> webapps in using Dockerfiles). > >that's also one of the issues the earlier Dockerfile of Dirk and Michael >had -- and I also had issues with that when I started with Docker. It's >somewhat counter-intuitive that you have to `EXPOSE` ports to publish >them, but not define `VOLUME`s to bind them. > >Defining a volume in a Dockerfile results in a bind-mount of a new, >empty folder (if not defined otherwise through `--volume[-from]`) during >container instanciation. So you can very well add files into the image >-- but they're hidden by that bind mount in the running container. > >Regards, >Jens > > >-- >Jens Erat > > [phone]: tel:+49-151-56961126 > [mail]: mailto:em...@jenserat.de >[jabber]: xmpp:jab...@jenserat.de > [web]: http://www.jenserat.de > > OpenPGP: 0D69 E11F 12BD BA07 7B37 26AB 4E1F 799A A4FF 2279 >
[basex-talk] Odd Behavior from Ruby Client
I have my Ruby-based server-side commit hooks working but now I'm seeing a failure result I can't figure out yet. I have a large sample data set that I'm using for testing (https://hub.docker.com/r/ditacommunity/demo-content/). This content is non-trivial DITA and all the files are valid (e.g., Oxygen validates the maps and topics). However, when I load the files via my Ruby script that uses the BaseXClient.rb file, a few of the files fail to load with the error indicating that something in the BaseX client/server chain is trying to interpret the XML data as a file path. As far as I can tell it's the same documents every time, so I suspect there is something about the documents' details that is causing this. I guess my question is: what could cause this behavior and how might I go about debugging and correcting it? The Ruby script is here: https://github.com/dita-for-small-teams/dfst-git-commit-hooks/blob/develop/server-side/post-receive I probably have to rewrite the script to use the REST API so that I have more control over the encoding details (per earlier discussion of how to handle UTF-16 documents) but I'd still like to know what's causing this as it points to either a really subtle user error or a flaw in the basic API mechanism. Cheers, Eliot Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Catalog Resolution Under Windows
This code would fail to construct a good Windows file: URL because it misses out the "/" after the "file:". Cheers, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com On 3/14/16, 5:02 AM, "Kendall Shaw" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of kendall.s...@workday.com> wrote: >I remember trying to figure out if the catalog files are supposed to be >URIs or file paths (in commons resolver), and I think it was >Catalog.parseCatalogFile that had a problem. It tries essentially: > >catalogPath = path.replace('\\', '/') > >new URL(baseURL, catalogPath) > >and if that is a malformed URL, it tries: > >new URL("file:", catalogPath) > >So, with a canonical windows file path, the URL that it uses is: > >file:c:/somewhere/some.file > > > > >And then later somewhere that URL is not resolved. > >Kendall > >On 3/13/16, 12:27 AM, "basex-talk-boun...@mailman.uni-konstanz.de on >behalf of Eliot Kimber" <basex-talk-boun...@mailman.uni-konstanz.de on >behalf of ekim...@contrext.com> wrote: > >>I was just trying that now (the trailing quote was my typo but was also a >>red herring). >> >>Yes, the value must be a URL and I've verified that using file:/c:/... >>works. >> >>I'm trying to put together a code patch that does this automatically when >>a normal filename is specified because this is pretty bad Simon Says >>behavior. It doesn't help that the Javadoc for the Apache CatalogManager >>isn't itself explicit that the catalog files property is in fact URLs, >>not >>OS-specific file paths. >> >>Cheers, >> >>E. >> >> >>Eliot Kimber, Owner >>Contrext, LLC >>http://contrext.com >> >> >> >> >>On 3/13/16, 5:19 PM, "Imsieke, Gerrit, le-tex" >><basex-talk-boun...@mailman.uni-konstanz.de on behalf of >>gerrit.imsi...@le-tex.de> wrote: >> >>>Hi Eliot, >>> >>>I didn't recently try it on Windows myself, but just two observations.
>>> >>>On 13.03.2016 01:13, Eliot Kimber wrote: >>>> CATFILE = C:/workspace/DITA-OT2.x/catalog-dita.xml" >>> >>>There is a trailing quote sign here, is this intentional? Don't know the >>>effects of unbalanced quotes here. >>> >>>In any case, it might be necessary to give the location as a file: URI, >>>as in file:///C:/workspace/DITA-OT2.x/catalog-dita.xml >>>Did you already try that? >>> >>>Gerrit >>> >> >>
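The normalization discussed in this thread (turning a plain Windows path into a file: URL before it reaches the catalog resolver) can be sketched in a few lines of Ruby. This is only an illustration under assumptions: the function name catalog_path_to_file_url is hypothetical, the code is not taken from BaseX or from the Apache resolver, and it does not percent-encode spaces or other special characters.

```ruby
# Hypothetical helper: turn a catalog file location into the file:/c:/...
# form that was reported to work. Values that already are file: URLs pass
# through unchanged.
def catalog_path_to_file_url(path)
  return path if path.start_with?('file:')

  # Backslashes are not valid URL path separators.
  normalized = path.gsub('\\', '/')

  if normalized.match?(%r{\A[A-Za-z]:/})
    # Drive-letter path: the "/" after "file:" is the part that went missing.
    "file:/#{normalized}"
  elsif normalized.start_with?('/')
    # Absolute POSIX path (OS X, Linux, Docker).
    "file:#{normalized}"
  else
    # Relative path: left for the caller to resolve against a base.
    normalized
  end
end

puts catalog_path_to_file_url('C:\workspace\DITA-OT2.x\catalog-dita.xml')
# prints file:/C:/workspace/DITA-OT2.x/catalog-dita.xml
```

The drive-letter branch is the case that matters here: without the extra "/", the result is file:c:/..., which is the form reported above as not being resolved.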
Re: [basex-talk] Catalog Resolution Under Windows
I tried the same experiment on Windows 10 and got the same failure result as on Windows 7. Java is: C:\WINDOWS\system32>java -version java version "1.8.0_66" Java(TM) SE Runtime Environment (build 1.8.0_66-b18) Java HotSpot(TM) 64-Bit Server VM (build 25.66-b18, mixed mode) Cheers, Eliot Eliot Kimber, Owner Contrext, LLC http://contrext.com On 3/13/16, 9:05 AM, "Eliot Kimber" <ekim...@contrext.com> wrote: >I tried some more experiments. > >I used the BaseX GUI as follows: > >1. Created a new database and used the GUI to select the catalog file >directly. >2. Used the add function from the New Database dialog to load a directory, >selecting all the .xml, .dita, and .ditamap files. > >All the DITA files were skipped. > >I did the same test using 8.4.1 under OS X but against the same source >files and catalog and it worked fine. > >So this seems to be a general issue with catalog resolution under Windows. > >Cheers, > >E. > >Eliot Kimber, Owner >Contrext, LLC >http://contrext.com > > > > >On 3/12/16, 6:11 PM, "Eliot Kimber" ><basex-talk-boun...@mailman.uni-konstanz.de on behalf of >ekim...@contrext.com> wrote: > >>I'm trying to make BaseX work under Windows 7 and I don't seem to be able >>to get catalog resolution to work. (I'm doing a workshop in Japan and the >>classroom only has 32-bit Windows machines available--since Docker >>requires 64-bit Windows I'm having to scramble to make the same code work >>directly under Windows 7 32-bit--ugh.) >> >>I'm using BaseX 8.4.1 with Java 8 (the Java supplied with the 32-bit >>version of oXygenXML). >> >>In my .basex file I have these entries: >> >>CATFILE = C:/workspace/DITA-OT2.x/catalog-dita.xml" >>DTD = true >>SKIPCORRUPT = true >>CHOP = false >> >> >>Using the DBA Web app I can see that the CATFILE property is set to that >>value, DTD is checked, CHOP is unchecked, and SKIPCORRUPT is checked, so >>my settings are clearly being used. 
>> >>However, if I create a database and use the DBA app to load a document >>that uses a DTD mapped by the catalog (e.g., a DITA document), load fails >>with a "Can't resolve DTD" message. >> >>The document is valid according to oXygen (and it's the same >>catalog--this >>is the OT oXygen is using) and of course my OS X and Docker-based >>versions >>of the same setup work fine, so it looks like a Windows-specific issue. >> >>Is there any known issue with catalog resolution under Windows? Is there >>anything I can do to try to debug the problem? >> >>Thanks, >> >>Eliot >> >> >>Eliot Kimber, Owner >>Contrext, LLC >>http://contrext.com >> >> >> >> > > >
Re: [basex-talk] Catalog Resolution Under Windows
I tried some more experiments. I used the BaseX GUI as follows: 1. Created a new database and used the GUI to select the catalog file directly. 2. Used the add function from the New Database dialog to load a directory, selecting all the .xml, .dita, and .ditamap files. All the DITA files were skipped. I did the same test using 8.4.1 under OS X but against the same source files and catalog and it worked fine. So this seems to be a general issue with catalog resolution under Windows. Cheers, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com On 3/12/16, 6:11 PM, "Eliot Kimber" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of ekim...@contrext.com> wrote: >I'm trying to make BaseX work under Windows 7 and I don't seem to be able >to get catalog resolution to work. (I'm doing a workshop in Japan and the >classroom only has 32-bit Windows machines available--since Docker >requires 64-bit Windows I'm having to scramble to make the same code work >directly under Windows 7 32-bit--ugh.) > >I'm using BaseX 8.4.1 with Java 8 (the Java supplied with the 32-bit >version of oXygenXML). > >In my .basex file I have these entries: > >CATFILE = C:/workspace/DITA-OT2.x/catalog-dita.xml" >DTD = true >SKIPCORRUPT = true >CHOP = false > > >Using the DBA Web app I can see that the CATFILE property is set to that >value, DTD is checked, CHOP is unchecked, and SKIPCORRUPT is checked, so >my settings are clearly being used. > >However, if I create a database and use the DBA app to load a document >that uses a DTD mapped by the catalog (e.g., a DITA document), load fails >with a "Can't resolve DTD" message. > >The document is valid according to oXygen (and it's the same catalog--this >is the OT oXygen is using) and of course my OS X and Docker-based versions >of the same setup work fine, so it looks like a Windows-specific issue. > >Is there any known issue with catalog resolution under Windows? Is there >anything I can do to try to debug the problem? 
> >Thanks, > >Eliot > > >Eliot Kimber, Owner >Contrext, LLC >http://contrext.com > > > >
[basex-talk] Catalog Resolution Under Windows
I'm trying to make BaseX work under Windows 7 and I don't seem to be able to get catalog resolution to work. (I'm doing a workshop in Japan and the classroom only has 32-bit Windows machines available--since Docker requires 64-bit Windows I'm having to scramble to make the same code work directly under Windows 7 32-bit--ugh.) I'm using BaseX 8.4.1 with Java 8 (the Java supplied with the 32-bit version of oXygenXML). In my .basex file I have these entries: CATFILE = C:/workspace/DITA-OT2.x/catalog-dita.xml" DTD = true SKIPCORRUPT = true CHOP = false Using the DBA Web app I can see that the CATFILE property is set to that value, DTD is checked, CHOP is unchecked, and SKIPCORRUPT is checked, so my settings are clearly being used. However, if I create a database and use the DBA app to load a document that uses a DTD mapped by the catalog (e.g., a DITA document), load fails with a "Can't resolve DTD" message. The document is valid according to oXygen (and it's the same catalog--this is the OT oXygen is using) and of course my OS X and Docker-based versions of the same setup work fine, so it looks like a Windows-specific issue. Is there any known issue with catalog resolution under Windows? Is there anything I can do to try to debug the problem? Thanks, Eliot ---- Eliot Kimber, Owner Contrext, LLC http://contrext.com
Re: [basex-talk] Catalog Resolution Under Windows
I was just trying that now (the trailing quote was my typo but was also a red herring). Yes, the value must be a URL and I've verified that using file:/c:/... works. I'm trying to put together a code patch that does this automatically when a normal filename is specified because this is pretty bad Simon Says behavior. It doesn't help that the Javadoc for the Apache CatalogManager isn't itself explicit that the catalog files property is in fact URLs, not OS-specific file paths. Cheers, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com On 3/13/16, 5:19 PM, "Imsieke, Gerrit, le-tex" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of gerrit.imsi...@le-tex.de> wrote: >Hi Eliot, > >I didn't recently try it on Windows myself, but just two observations. > >On 13.03.2016 01:13, Eliot Kimber wrote: >> CATFILE = C:/workspace/DITA-OT2.x/catalog-dita.xml" > >There is a trailing quote sign here, is this intentional? Don't know the >effects of unbalanced quotes here. > >In any case, it might be necessary to give the location as a file: URI, >as in file:///C:/workspace/DITA-OT2.x/catalog-dita.xml >Did you already try that? > >Gerrit >
Re: [basex-talk] XLSX to XML
I've written some XSLT to handle some of XSLX in the context of my Word-to-DITA transformation framework: https://github.com/dita4publishers/org.dita4publishers.word2dita/tree/devel op/xsl The business logic should be relatively easy to recode as Xquery if appropriate. In addition, the Apache POI project provides pretty robust Java code reading and writing Office documents including Excel files. It should be relatively easy to use it from BaseX as a Java extension. Cheers, Eliot Eliot Kimber, Owner Contrext, LLC http://contrext.com On 4/11/16, 6:00 AM, "Florian Eckey" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of florian.ec...@googlemail.com> wrote: >Hi Christian, > >I have only seen that there exists a function in MarkLogic for reading >excel files or especially convert excel to xhtml, but it looks like (as >in the example) MarkLogic can only process xls files, not xlsx. >https://docs.marklogic.com/xdmp.excelConvert > >Looks like it does not support updates. >As soon as I get more experience, i will share it here. > >Best, >Florian > > > > >Am 06.04.16, 12:48 schrieb "Christian Grün" <christian.gr...@gmail.com>: > >>Hi Florian, >> >>Out of interest: Could you tell us a little about your experiences >>with the MarkLogic Excel module? Does it e.g. support updates as well? >> >>Thanks in advance, >>Christian >> >> >> >>On Wed, Apr 6, 2016 at 12:43 PM, Dirk Kirsten <d...@basex.org> wrote: >>> Hello Florian, >>> >>> please remember to always include the list when replying as it allows >>> others to benefit from our exchange as well and also allows others to >>> help you. >>> >>> I just want to point out, again, that you it doesn't make sense to say >>> "convert the excel file to xml", because it already is XML. Yes, there >>> might be multiple XML files and they reference each other, but this is >>> just a very normal thing for XML and for every reasonably complex >>>system >>> to reference each other. 
>>> >>> So I guess what you really want is an XQuery module which allows you to >>> easily manipulate xlsx files without the need to worry about internal >>> ooxml format stuff like shared strings. This if course makes a lot of >>> sense! However, as the format is ridiculously complicated it is a hard >>> task to write a general-purpose library for all kinds of manipulations. >>> As Christian indicated we wrote for ourself some small helpers >>>functions >>> which dies the stuff we need in our projects, but is very far from >>>being >>> complete on the xlsx standard. >>> >>> Cheers >>> Dirk >>> >>> On 04/06/2016 12:35 PM, Florian Eckey wrote: >>>> Hello Dirk, >>>> >>>> thanks. That was my idea as well. But the format xlsx is really >>>>complicated, because the content (sheet01.xml) in the cells is >>>>referenced to an other document (stringValues.xml) using an index. I >>>>guess anyone has implemented a simple xquery to convert the excel file >>>>to xml? >>>> But if nobody has done that before, i have to spend time for the >>>>implementation on my own. :) >>>> >>>> Thanks, best, >>>> Florian >>>> >>>> >>>> >>>> >>>> Am 06.04.16, 12:26 schrieb "Dirk Kirsten" <d...@basex.org>: >>>> >>>>> Hello Florian, >>>>> >>>>> xlsx is just a zip file containing many xml files. you can simply >>>>>unzip >>>>> the xlsx (e.g. by using the BaseX zip module), modify the xml files >>>>> inside using standard XQuery and rezip it again as xslx. >>>>> >>>>> Cheers >>>>> Dirk >>>>> >>>>> On 04/06/2016 12:18 PM, Florian Eckey wrote: >>>>>> Hi guys, >>>>>> >>>>>> are there any ideas how to convert excel's xlsx (not xls) files to >>>>>>xml >>>>>> with XQuery or to use a Java library which can be imported? It looks >>>>>> like BaseX has no internal functions as for instance MarkLogic. >>>>>> >>>>>> Any ideas or example implementations to do that in XQuery or Java? 
>>>>>> >>>>>> Best, >>>>>> Florian >>>>> -- >>>>> Dirk Kirsten, BaseX GmbH, http://basexgmbh.de >>>>> |-- Firmensitz: Blarerstrasse 56, 78462 Konstanz >>>>> |-- Registergericht Freiburg, HRB: 708285, Geschäftsführer: >>>>> | Dr. Christian Grün, Dr. Alexander Holupirek, Michael Seiferle >>>>> `-- Phone: 0049 7531 91 68 276, Fax: 0049 7531 20 05 22 >>>>> >>> >>> -- >>> Dirk Kirsten, BaseX GmbH, http://basexgmbh.de >>> |-- Firmensitz: Blarerstrasse 56, 78462 Konstanz >>> |-- Registergericht Freiburg, HRB: 708285, Geschäftsführer: >>> | Dr. Christian Grün, Dr. Alexander Holupirek, Michael Seiferle >>> `-- Phone: 0049 7531 91 68 276, Fax: 0049 7531 20 05 22 >>> > >
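The shared-strings indirection described in this thread (cells that hold an index into a separate strings part rather than the text itself) can be illustrated with a short Ruby sketch. Assumptions: the two XML parts have already been extracted from the .xlsx zip (in a standard package they are xl/sharedStrings.xml and a sheet under xl/worksheets/), the function names shared_strings and cell_values are hypothetical, and only plain values and shared strings are handled (no inline strings, dates, or formulas). REXML is used for parsing; elements are matched by local name, which sidesteps the default namespace.

```ruby
require 'rexml/document'

# Returns the shared strings in index order, given the XML text of the
# shared-strings part.
def shared_strings(shared_strings_xml)
  doc = REXML::Document.new(shared_strings_xml)
  doc.root.elements.to_a.select { |el| el.name == 'si' }.map do |si|
    # An <si> holds either a single <t> or several rich-text runs
    # (<r><t>..</t></r>); concatenate every <t> found below it.
    texts = []
    si.each_recursive { |el| texts << el.text.to_s if el.name == 't' }
    texts.join
  end
end

# Returns { "A1" => value, ... } for a worksheet part. Cells marked t="s"
# store an index into the shared strings; other cells store the value itself.
def cell_values(sheet_xml, strings)
  doc = REXML::Document.new(sheet_xml)
  values = {}
  doc.root.each_recursive do |el|
    next unless el.name == 'c'

    v = el.elements.to_a.find { |child| child.name == 'v' }
    next if v.nil?

    raw = v.text.to_s
    values[el.attributes['r']] = el.attributes['t'] == 's' ? strings[raw.to_i] : raw
  end
  values
end
```

The same lookup should be straightforward to express in XQuery once the parts are unzipped; the sketch is only meant to show why the sheet part cannot be read on its own.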
Re: [basex-talk] Catalog Resolution Under Windows
The issue was entirely the syntax of the URL for the catalog file. Once I used "file:/c:/foo/bar/catalog-dita.xml" then it worked reliably. The catalog resolver requires a URL and unless I was looking at the wrong BaseX code there was nothing in there to correct "\" to "/" before passing the value to the catalog manager. Cheers, E. Eliot Kimber, Owner Contrext, LLC http://contrext.com On 3/21/16, 7:46 AM, "Christian Grün" <christian.gr...@gmail.com> wrote: >Hi Eliot, > >I spent some time to find out what might have gone wrong in your >scenario. I created a little, self-contained BaseX command script, >which seems to work out of the box with Windows. Could you please give >it a try and modify it such that I can see what goes wrong? > >You can change the value of the 'catfile' option to an absolute path >with forward or backward slashes, it shouldn¹t make a difference. > >Thanks in advance >Christian > > > > >On Sat, Mar 12, 2016 at 10:11 AM, Eliot Kimber <ekim...@contrext.com> >wrote: >> I'm trying to make BaseX work under Windows 7 and I don't seem to be >>able >> to get catalog resolution to work. (I'm doing a workshop in Japan and >>the >> classroom only has 32-bit Windows machines available--since Docker >> requires 64-bit Windows I'm having to scramble to make the same code >>work >> directly under Windows 7 32-bit--ugh.) >> >> I'm using BaseX 8.4.1 with Java 8 (the Java supplied with the 32-bit >> version of oXygenXML). >> >> In my .basex file I have these entries: >> >> CATFILE = C:/workspace/DITA-OT2.x/catalog-dita.xml" >> DTD = true >> SKIPCORRUPT = true >> CHOP = false >> >> >> Using the DBA Web app I can see that the CATFILE property is set to that >> value, DTD is checked, CHOP is unchecked, and SKIPCORRUPT is checked, so >> my settings are clearly being used. >> >> However, if I create a database and use the DBA app to load a document >> that uses a DTD mapped by the catalog (e.g., a DITA document), load >>fails >> with a "Can't resolve DTD" message. 
>> >> The document is valid according to oXygen (and it's the same >>catalog--this >> is the OT oXygen is using) and of course my OS X and Docker-based >>versions >> of the same setup work fine, so it looks like a Windows-specific issue. >> >> Is there any known issue with catalog resolution under Windows? Is there >> anything I can do to try to debug the problem? >> >> Thanks, >> >> Eliot >> >> >> Eliot Kimber, Owner >> Contrext, LLC >> http://contrext.com >> >> >>
[basex-talk] Basex Docker Container: How to Control Options
I’m trying to set up a Docker container with my own Web app using the latest official containers as a base and following the instructions in the latest BaseX docs. As part of this setup I need to add several additional options. I also want to include the dba app. Following the instructions in the docs I’m putting my additional options in /srv/.basex and these take effect. However, if I use the basex/dba image as a base, the other options, such as the custom REPOPATH and WEBPATH settings do not get used. It’s not clear to me from looking at the various Dockerfiles how the dba container sets the options and how I can then add to them without overwriting them. Of course I can duplicate the options in my Dockerfile but it seems like I should be able to add my options additively but I’m not seeing how to do it, so I feel I’m missing something. In my Dockerfile I have: COPY basex/dot_basex /srv/.basex If I don’t include this line then the container otherwise works in that my Web app is there but it requires these additional options in order to function properly. If I include it then the effective options reflect only my changes and not those required for the custom Web apps to work. Thanks, Eliot -- Eliot Kimber http://contrext.com
Re: [basex-talk] Basex Docker Container: How to Control Options
I found my original mail thread from January 2016: https://mailman.uni-konstanz.de/pipermail/basex-talk/2016-January/010171.html Basically I had to append my options to the existing .basex file rather than simply provide a file. However, that isn’t an option here because there is no existing /srv/.basex file (the place where the file needs to be). As an experiment I tried doing this: RUN touch /srv/.basex && \ echo "DTD=true" >> .basex && \ echo "CATFILE=/opt/dita-ot/DITA-OT/catalog-dita.xml" >> .basex && \ echo "SKIPCORRUPT=true" >> .basex && \ echo "CHOP=false" >> .basex && \ echo "DBPATH=/srv/BaseXData" >> .basex && \ echo "REPOPATH=/srv/BaseXRepo" >> .basex && \ echo "WEBPATH=/srv/BaseXWeb" >> .basex && \ chown basex /srv/.basex RUN echo "/srv/.basex:\n===" && cat /srv/.basex && echo "===" And I get this from the build command: /srv/.basex: === DTD=true CATFILE=/opt/dita-ot/DITA-OT/catalog-dita.xml SKIPCORRUPT=true CHOP=false DBPATH=/srv/BaseXData REPOPATH=/srv/BaseXRepo WEBPATH=/srv/BaseXWeb === When I start up this container I get this: /srv/.basex: Unknown option 'DTD'. /srv/.basex: Unknown option 'CATFILE'. /srv/.basex: Unknown option 'SKIPCORRUPT'. /srv/.basex: Unknown option 'CHOP'. /srv/.basex: writing new configuration file. Note that it’s only complaining about DTD, CATFILE, SKIPCORRUPT, and CHOP, but not DBPATH, REPOPATH, and WEBPATH. I tried reordering things, no difference in result (but the messages reflect the order change). So it must be something about how these particular options are processed rather than an issue with the .basex file, at least as far as I can see. Cheers, Eliot -- Eliot Kimber http://contrext.com
Re: [basex-talk] Basex Docker Container: How to Control Options
Discovered that if I use USER root in the Dockerfile I have to then do USER basex before the end so that the container will be running as basex, otherwise it does not start up correctly. I also have to chown the /srv/.basex file to basex otherwise it can’t overwrite it on startup. Made some progress but something is still not working. I now have a Dockerfile based on basex/dba that results in a running system with both the dba and my custom app runnable. However, if I then add this: COPY basex/dot_basex /srv/.basex As before, on startup I get this: docker run --name linkmgr-test -p 8984:8984 link-manager-test /srv/.basex: Unknown option 'CATFILE'. /srv/.basex: Unknown option 'DTD'. /srv/.basex: Unknown option 'SKIPCORRUPT'. /srv/.basex: Unknown option 'CHOP'. /srv/.basex: writing new configuration file. Creating /srv/BaseXWeb/WEB-INF/web.xml Creating /srv/BaseXWeb/WEB-INF/jetty.xml I think this Unknown option error is actually an issue with the file itself but I don’t recall what the exact cause and/or solution was. Cheers, E. -- Eliot Kimber http://contrext.com
Re: [basex-talk] Possible to Access Environment Variables from XQuery?
Thanks, it never occurred to me this would be a built-in function of XQuery 3.

Cheers,

E.

--
Eliot Kimber
http://contrext.com

On 3/25/17, 7:05 PM, "Christian Grün" <christian.gr...@gmail.com> wrote:

Hi Eliot,

fn:environment-variable should do the job; proc:property [1] can be used to access system properties.

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Process_Module#proc:property

On Sat, Mar 25, 2017 at 7:01 PM, Eliot Kimber <ekim...@contrext.com> wrote:
> I’m running BaseX in a Docker container started via docker-compose. I need to communicate some local configuration details to BaseX. I can set these as environment variables (e.g., using .env with docker-compose) but I don’t see anything in the docs that would let me access environment variables from XQuery.
>
> Is there a built-in way to do this?
>
> Thanks,
>
> Eliot
>
> --
> Eliot Kimber
> http://contrext.com
[basex-talk] Possible to Access Environment Variables from XQuery?
I’m running BaseX in a Docker container started via docker-compose. I need to communicate some local configuration details to BaseX. I can set these as environment variables (e.g., using .env with docker-compose) but I don’t see anything in the docs that would let me access environment variables from XQuery. Is there a built-in way to do this? Thanks, Eliot -- Eliot Kimber http://contrext.com
Re: [basex-talk] Basex Docker Container: How to Control Options
Perhaps a more useful question is: is there a workaround for this? That is, I need to have the DTD, CATFILE, CHOP, and SKIPCORRUPT options set when I load files into a database, which I’m doing through the remote API. Is there either a separate options file that I should be using or a way to specify these options at document load time? Cheers, E. -- Eliot Kimber http://contrext.com
Re: [basex-talk] Basex Docker Container: How to Control Options
Yes, the options were at the end. But looking at the code I don’t see how that could have mattered. Perhaps I missed a detail? Cheers, E. -- Eliot Kimber http://contrext.com From: Christian Grün <christian.gr...@gmail.com> Date: Sunday, April 2, 2017 at 11:38 AM To: Eliot Kimber <ekim...@contrext.com> Cc: "basex-talk@mailman.uni-konstanz.de" <basex-talk@mailman.uni-konstanz.de> Subject: Re: [basex-talk] Basex Docker Container: How to Control Options Hi Eliot, Did you try to specify the option in the local section of the .basex file (at the bottom, after all other options)? Best, Christian
Re: [basex-talk] Basex Docker Container: How to Control Options
I’ve worked around the configuration file problem by setting the options on the Java command line by setting BASEX_JVM in my Dockerfile: ENV BASEX_JVM="-Dorg.basex.CHOP=false -Dorg.basex.CATFILE=/opt/dita-ot/DITA-OT/catalog-dita.xml -Dorg.basex.DTD=true -Dorg.basex.SKIPCORRUPT=true" That seems to work. I guess I could also set this environment variable in my docker-compose.yml file, although except for CATFILE, these are invariant for this application. Cheers, E. -- Eliot Kimber http://contrext.com
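The BASEX_JVM workaround described in the message above (passing each option as a -Dorg.basex.NAME=value system property) can be sketched as a shell function that assembles the value from a list of NAME=value pairs. The function name build_basex_jvm is hypothetical; the org.basex. prefix and the option values are taken from the message itself, and nothing here has been checked against a running container.

```shell
#!/bin/sh
# Hypothetical helper: turn NAME=value pairs into the -Dorg.basex.* system
# properties used in the BASEX_JVM workaround from this thread.
build_basex_jvm() {
  result=
  for pair in "$@"; do
    # Append with a separating space only when result is already non-empty.
    result="${result:+$result }-Dorg.basex.$pair"
  done
  printf '%s\n' "$result"
}

# The four options from the thread; only CATFILE is site-specific.
BASEX_JVM=$(build_basex_jvm \
  CHOP=false \
  CATFILE=/opt/dita-ot/DITA-OT/catalog-dita.xml \
  DTD=true \
  SKIPCORRUPT=true)
export BASEX_JVM
printf '%s\n' "$BASEX_JVM"
```

Building the string from a list keeps the invariant options in one place and makes it easier to override CATFILE per environment, for example from docker-compose.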
Re: [basex-talk] Basex Docker Container: How to Control Options
Are you really saying that the “# Local options” comment is a necessary part of the configuration file? Really? Because I don’t think a comment should ever be a required part of any syntax… If the configuration file needs to have labeled components, use YAML or XML or something. Cheers, e. -- Eliot Kimber http://contrext.com On 4/2/17, 2:25 PM, "Christian Grün" <christian.gr...@gmail.com> wrote: Hi Eliot, If you start with a new version of BaseX, in which the .basex configuration file is still empty or non-existent, there may be no "Local Options" comment. I hope the revised Wiki makes this more obvious [1]. Best, Christian [1] http://docs.basex.org/wiki/Options
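Going by the exchange above, the database-level options (DTD, CATFILE, SKIPCORRUPT, CHOP) apparently have to sit in a local section at the bottom of .basex, after a "# Local Options" comment, while DBPATH, REPOPATH, and WEBPATH go before it. A minimal sketch of generating such a file follows. The function name write_basex_config is made up, the exact marker spelling is an assumption based on the thread and the linked Options page, and the paths are the ones from the earlier Dockerfile experiment.

```shell
#!/bin/sh
# Hypothetical helper: write a .basex file with the global options first and
# the database-level options after a "# Local Options" marker (marker text
# assumed from the thread; verify against the BaseX Options documentation).
write_basex_config() {
  cat > "$1" <<'EOF'
DBPATH = /srv/BaseXData
REPOPATH = /srv/BaseXRepo
WEBPATH = /srv/BaseXWeb

# Local Options
DTD = true
CATFILE = /opt/dita-ot/DITA-OT/catalog-dita.xml
SKIPCORRUPT = true
CHOP = false
EOF
}

# Demo: write to a temporary file and show the result.
out=$(mktemp)
write_basex_config "$out"
cat "$out"
```

In a Dockerfile the generated file would still need to be owned by the basex user, as noted earlier in the thread, because BaseX rewrites it on startup.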
Re: [basex-talk] Basex Docker Container: How to Control Options
I dug into the code and I think this is a code bug. I don’t understand the overall flow well enough to know where the bug is, but I suspect it’s a simple change. If I debug the BaseXGUI class and have a ~/.basex file that includes CATFILE=foo it will fail with an unknown option error because the option class being used is StaticOptions and StaticOptions does not define CATFILE as an option. The CATFILE option, along with a few others, is defined on MainOptions: private static final Option[] INHERIT = { CHOP, INTPARSE, STRIPNS, DTD, XINCLUDE, CATFILE }; Looking at BaseXHTTP it uses HTTPContext which only uses StaticOptions, not MainOptions, which means it can never allow the INHERIT options. So something must have changed from 8.1-ish to now to change how options are handled since these options worked before I moved to the latest BaseX container and versions. But there definitely seems to be an oversight in the initial handling of the options file in the BaseXHTTP server (and in BaseXGUI too, since I get the same behavior there). I would make a pull request but I don’t presume to know what policies or general practices underlie the current options mechanism design. Cheers, Eliot -- Eliot Kimber http://contrext.com
[basex-talk] Approximating ML's Reverse Queries Mechanism?
I’m working on a project that currently depends on MarkLogic’s reverse query mechanism. This is a feature whereby you store documents that contain MarkLogic-specific queries (queries in their cts namespace). These get indexed in some way (no idea how the index works). You can then search for these stored queries in the context of specific nodes and ML will return those query documents that would match the specified nodes.

Their driving use case is alerting-type applications where when new docs get added you see what queries they apply to and then use those queries to do something. My use case is classification: given a node to be classified, find all queries that match it and from the query get the classification details (preferred term, variant forms, associated taxonomy, etc.). This process definitely depends on MarkLogic’s full-text search features, for example to match any form of a non-preferred term to a full-text search that would match it.

I have a large corpus to classify (approximately 45 million objects at current count). The processing is inherently parallelizable so I’m looking at non-ML options that would allow us to scale to the limits of our hardware budget. Even if each node was less efficient than ML we would be able to implement massive throughput for this classification operation. In theory we could scale to one processor per object if budget were unlimited (it is not unlimited), so even a brute-force solution would perform well at larger scales.

So I guess I have two questions really:

1. Can anyone share ways they use BaseX for doing classification in general? I’ve so far just been focused on analyzing the current system to find performance bottlenecks so I haven’t yet had a chance to think through the classification process in general, but there must be well-understood strategies. I suspect that one could build the equivalent of ML’s reverse query index in BaseX.

2. Is there a direct equivalent to ML’s reverse query facility in BaseX or an obvious route to building one?

Thanks, Eliot -- Eliot Kimber http://contrext.com
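To make the brute-force idea in the message above concrete, here is a deliberately naive sketch of "reverse query" classification in plain shell: the stored queries are reduced to term|label lines, and every stored term is tested against the text to be classified. None of this uses MarkLogic or BaseX APIs; the classify function, the rules file format, and the sample terms are all invented for illustration, and the case-insensitive substring match is only a crude stand-in for real full-text matching.

```shell
#!/bin/sh
# Hypothetical brute-force stand-in for a reverse query index:
# each rule line is "term|label"; print the label of every rule whose
# term occurs (case-insensitively, as a fixed string) in the given text.
classify() {
  text=$1
  rules=$2
  while IFS='|' read -r term label; do
    if printf '%s\n' "$text" | grep -qiF -- "$term"; then
      printf '%s\n' "$label"
    fi
  done < "$rules"
}

# Demo with made-up rules and text.
rules_file=$(mktemp)
printf '%s\n' \
  'heart attack|myocardial-infarction' \
  'stroke|cerebrovascular-accident' > "$rules_file"
classify 'Patient had two heart attacks last year' "$rules_file"
```

Because each text is classified independently, runs like this can be spread across as many processes as the hardware allows, which is the scaling property the message is relying on; an indexed approach would replace the inner loop.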
[basex-talk] Test Failure From Current Source, Failure to Run GUI from Eclipse
I thought I would see if I could add the Xerces grammar caching to BaseX, at least to see if it improved things for DITA loading.

I've updated my fork of the basex project to the current version on GitHub. Using the master branch as the basis for my local feature branch and with no modified files, I get one failing test from "mvn test":

Failed tests:
  FnTest.sum:91->AdvancedQueryTest.error:78 Query did not fail:
  sum(1, 'x')
  [E] Error: err:FORG0006
  [F] 1

Tests run: 1578, Failures: 1, Errors: 0, Skipped: 5

I'm also not able to run the BaseXGUI class using an Eclipse run configuration per the documentation on the BaseX site. I get a bunch of messages about things missing from English.lang:

/lang/English.lang not found.
English.lang: 'port' is missing
... lots more
English.lang: 'h_no_html_parser' is missing

Then this fatal error:

Image not found: /img/text_xml.png
  at org.basex.util.Util.stack(Util.java:224)
  at org.basex.gui.layout.BaseXImages.url(BaseXImages.java:125)
  at org.basex.gui.layout.BaseXImages.get(BaseXImages.java:62)
  at org.basex.gui.layout.BaseXImages.icon(BaseXImages.java:109)
  at org.basex.gui.layout.BaseXImages.(BaseXImages.java:34)
  at org.basex.gui.GUIMacOSX.addDockIcon(GUIMacOSX.java:84)
  at org.basex.gui.GUIMacOSX.(GUIMacOSX.java:60)
  at org.basex.BaseXGUI.(BaseXGUI.java:58)
  at org.basex.BaseXGUI.main(BaseXGUI.java:39)
Exception in thread "main" java.lang.ExceptionInInitializerError
  at org.basex.gui.GUIMacOSX.addDockIcon(GUIMacOSX.java:84)
  at org.basex.gui.GUIMacOSX.(GUIMacOSX.java:60)
  at org.basex.BaseXGUI.(BaseXGUI.java:58)
  at org.basex.BaseXGUI.main(BaseXGUI.java:39)
Caused by: java.lang.IllegalArgumentException: input == null!
  at javax.imageio.ImageIO.read(ImageIO.java:1388)
  at org.basex.gui.layout.BaseXImages.get(BaseXImages.java:72)
  at org.basex.gui.layout.BaseXImages.get(BaseXImages.java:62)
  at org.basex.gui.layout.BaseXImages.icon(BaseXImages.java:109)
  at org.basex.gui.layout.BaseXImages.(BaseXImages.java:34)
  ... 4 more

I suspect it's something very simple but I have no idea what it might be.

Thanks,
Eliot
--
Eliot Kimber
http://contrext.com
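The "not found" messages and the "input == null!" from ImageIO.read would be consistent with the resource files not being visible on the classpath of the run configuration, although that is a guess and not a confirmed diagnosis. A minimal sketch of a check that could be run under the same Eclipse run configuration follows. The class name ResourceCheck is hypothetical (it is not part of BaseX), and the two paths are simply the ones quoted in the messages above.

```java
import java.net.URL;

/** Hypothetical helper, not part of BaseX: reports whether resources resolve on the classpath. */
final class ResourceCheck {

  /** Returns true if the absolute resource path (for example "/img/x.png") can be resolved. */
  static boolean isOnClasspath(String path) {
    // A leading "/" makes Class.getResource treat the name as absolute.
    URL url = ResourceCheck.class.getResource(path);
    return url != null;
  }

  public static void main(String[] args) {
    // Paths taken from the error messages in the report above.
    String[] paths = { "/lang/English.lang", "/img/text_xml.png" };
    for (String path : paths) {
      System.out.println(path + " -> " + (isOnClasspath(path) ? "found" : "NOT found"));
    }
  }
}
```

If both paths print "NOT found", the project's resource directory is probably missing from the classpath of the run configuration; if they are found, the cause lies elsewhere.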
Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI
Yes, I wouldn't expect the grammars to chew up gigabytes. I'll provide a test data set for you. Cheers, E. -- Eliot Kimber http://contrext.com On 5/14/18, 12:45 PM, "Christian Grün" <christian.gr...@gmail.com> wrote: I would have expected some MBs to be sufficient for parsing even complex DTDs if nothing is cached (but caching could definitely speed up processing), so maybe there’s still something that we could have a look at. If you are interested, feel free to provide me with your files via a private message. On Mon, May 14, 2018 at 7:40 PM, Eliot Kimber <ekim...@contrext.com> wrote: > Yes, I would want caching on by default with the option to turn it off. I'm assuming it's currently not turned on (but to be honest I haven't taken the time to check the source code). > > Certainly for DITA content, grammar caching is the only practical way to parse a large number of topics in the same JVM without both using lots of memory and eating an avoidable processing cost of re-processing the grammar files for each document. > > DITA is probably somewhat unique in this regard because it takes such a different approach to grammar organization and use than pretty much any other XML application. > > Cheers, > > E.
Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI
Yes, I would want caching on by default with the option to turn it off. I'm assuming it's currently not turned on (but to be honest I haven't taken the time to check the source code). Certainly for DITA content, grammar caching is the only practical way to parse a large number of topics in the same JVM without both using lots of memory and eating an avoidable processing cost of re-processing the grammar files for each document. DITA is probably somewhat unique in this regard because it takes such a different approach to grammar organization and use than pretty much any other XML application. Cheers, E. -- Eliot Kimber http://contrext.com On 5/14/18, 12:17 PM, "Christian Grün" <christian.gr...@gmail.com> wrote: Hi Eliot, Thanks for your observations. > I think the solution is to turn on Xerces' grammar caching. I’m wondering what is happening here. Did you want to say that caching is enabled by default, and that it should be possible to turn it off? Cheers, Christian > The only danger there is that different DTDs within the same content set can have different expansions for the same external parameter entity reference (e.g., MathML DTDs), which then can lead to validation issues. For this reason the DITA OT makes use of the grammar cache switchable but on by default. > > Another option for DITA content in particular is to use the OT's preprocessing to parse all the docs and then use BaseX with the parsed docs where all the attributes have been expanded into the source. > > Cheers, > > E. > -- > Eliot Kimber > http://contrext.com > > > On 5/4/18, 9:52 AM, "Eliot Kimber" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of ekim...@contrext.com> wrote: > > Follow up--I tried giving BaseX the full 16GB of RAM and it still ultimately locked up with the memory meter showing 13GB. > > I'm thinking this must be some kind of memory leak.
> > I tried importing the DITA Open Toolkit's documentation source and that worked fine with the max memory being about 2.5GB, but it's only about 250 topics. > > Cheers, > > E. > > -- > Eliot Kimber > http://contrext.com > > On 5/3/18, 4:59 PM, "Eliot Kimber" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of ekim...@contrext.com> wrote: > > In the context of trying to do fun things with DITA docs in BaseX I downloaded the latest BaseX (9.0.1) and tried creating a new database and loading docs into it using the BaseX GUI. This is on macOS 10.13.4 with 16GB of hardware RAM available. > > My corpus is about 4000 DITA topics totaling about 30MB on disk. They are all in a single directory (not my decision) if that matters. > > Using the "parse DTDs" option and default indexing options (no token or full text indexes) I'm finding that even with 12GB of RAM allocated to the JVM the memory usage during load will eventually go to 12GB, at which point the processing appears to stop (that is, whatever I set the max memory to, when it's reached, things stop but I only got out of memory errors when I had much lower settings, like the default 2GB). > > I'm currently running a test with 14GB allocated and it is continuing but it does go to 12GB occasionally (watching the memory display on the Add progress panel). > > No individual file is that big--the biggest is 150K and typical is 30K or smaller. > > I wouldn't expect BaseX to have this kind of memory problem so I'm wondering if maybe there's an issue with memory on macOS or with DITA documents in particular (the DITA DTDs are notoriously large)? > > Should I expect BaseX to be able to load this kind of corpus with 14GB of RAM? > > Cheers, > > E. > -- > Eliot Kimber > http://contrext.com > > > > > > > > > > >
Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI
Hmm. In the process of testing my test data set I can't reproduce the earlier behavior. In my current tests, using the same data and the same BaseX version, I get a maximum of maybe 1GB for the largest file but just a few hundred MBs once everything is loaded. For 3800 topics of roughly 50K each (on average) it takes just a couple of seconds to load them with no DTDs, a minute or so with DTDs, which is consistent with the time cost of reparsing the (large) DITA grammars for each topic. So not sure what was happening when I tried this before but I definitely rebooted and installed macOS updates since then, so could have been some Java issue or who knows what else. The good news is that even without grammar caching the DITA topics do load in a reasonable (if not ideal) amount of time and with appropriate memory usage. Cheers, E. -- Eliot Kimber http://contrext.com On 5/14/18, 12:53 PM, "Eliot Kimber" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of ekim...@contrext.com> wrote: Yes, I wouldn't expect the grammars to chew up gigabytes. I'll provide a test data set for you. Cheers, E. -- Eliot Kimber http://contrext.com On 5/14/18, 12:45 PM, "Christian Grün" <christian.gr...@gmail.com> wrote: I would have expected some MBs to be sufficient for parsing even complex DTDs if nothing is cached (but caching could definitely speed up processing), so maybe there’s still something that we could have a look at. If you are interested, feel free to provide me with your files via a private message. On Mon, May 14, 2018 at 7:40 PM, Eliot Kimber <ekim...@contrext.com> wrote: > Yes, I would want caching on by default with the option to turn it off. I'm assuming it's currently not turned on (but to be honest I haven't taken the time to check the source code). 
> > Certainly for DITA content, grammar caching is the only practical way to parse a large number of topics in the same JVM without both using lots of memory and eating an avoidable processing cost of re-processing the grammar files for each document. > > DITA is probably somewhat unique in this regard because it takes such a different approach to grammar organization and use than pretty much any other XML application. > > Cheers, > > E.
Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI
More experimentation indicates that the issue is the DTDs--if I load the same content without DTD parsing then it loads fine and takes the expected relatively small amount of memory. I think the solution is to turn on Xerces' grammar caching. The only danger there is that different DTDs within the same content set can have different expansions for the same external parameter entity reference (e.g., MathML DTDs), which then can lead to validation issues. For this reason the DITA OT makes use of the grammar cache switchable but on by default. Another option for DITA content in particular is to use the OT's preprocessing to parse all the docs and then use BaseX with the parsed docs where all the attributes have been expanded into the source. Cheers, E. -- Eliot Kimber http://contrext.com On 5/4/18, 9:52 AM, "Eliot Kimber" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of ekim...@contrext.com> wrote: Follow up--I tried giving BaseX the full 16GB of RAM and it still ultimately locked up with the memory meter showing 13GB. I'm thinking this must be some kind of memory leak. I tried importing the DITA Open Toolkit's documentation source and that worked fine with the max memory being about 2.5GB, but it's only about 250 topics. Cheers, E. -- Eliot Kimber http://contrext.com On 5/3/18, 4:59 PM, "Eliot Kimber" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of ekim...@contrext.com> wrote: In the context of trying to do fun things with DITA docs in BaseX I downloaded the latest BaseX (9.0.1) and tried creating a new database and loading docs into it using the BaseX GUI. This is on macOS 10.13.4 with 16GB of hardware RAM available. My corpus is about 4000 DITA topics totaling about 30MB on disk. They are all in a single directory (not my decision) if that matters.
Using the "parse DTDs" option and default indexing options (no token or full text indexes) I'm finding that even with 12GB of RAM allocated to the JVM the memory usage during load will eventually go to 12GB, at which point the processing appears to stop (that is, whatever I set the max memory to, when it's reached, things stop but I only got out of memory errors when I had much lower settings, like the default 2GB). I'm currently running a test with 14GB allocated and it is continuing but it does go to 12GB occasionally (watching the memory display on the Add progress panel). No individual file is that big--the biggest is 150K and typical is 30K or smaller. I wouldn't expect BaseX to have this kind of memory problem so I'm wondering if maybe there's an issue with memory on macOS or with DITA documents in particular (the DITA DTDs are notoriously large)? Should I expect BaseX to be able to load this kind of corpus with 14GB of RAM? Cheers, E. -- Eliot Kimber http://contrext.com
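The trade-off described in this message (parse the grammar once and reuse it, but keep the cache switchable) can be illustrated with a toy sketch. This is not the Xerces API and not BaseX code: the class GrammarCacheSketch, its methods, and the use of a plain string as the "grammar" are all hypothetical stand-ins, and the public identifier is only an example key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy illustration of a switchable grammar cache; this is NOT the Xerces API. */
final class GrammarCacheSketch {
  private final Map<String, String> cache = new HashMap<>();
  private final Function<String, String> loader; // stands in for the expensive DTD parse
  private final boolean cachingEnabled;
  private int loads = 0;

  GrammarCacheSketch(Function<String, String> loader, boolean cachingEnabled) {
    this.loader = loader;
    this.cachingEnabled = cachingEnabled;
  }

  /** Returns the "grammar" for a DTD public ID, loading it at most once when caching is on. */
  String grammarFor(String publicId) {
    if (!cachingEnabled) {
      loads++;
      return loader.apply(publicId);
    }
    String grammar = cache.get(publicId);
    if (grammar == null) {
      loads++;
      grammar = loader.apply(publicId);
      cache.put(publicId, grammar);
    }
    return grammar;
  }

  /** Number of times the loader was actually invoked. */
  int loadCount() {
    return loads;
  }

  public static void main(String[] args) {
    String id = "-//OASIS//DTD DITA Topic//EN"; // example key only
    GrammarCacheSketch cached = new GrammarCacheSketch(pid -> "grammar for " + pid, true);
    GrammarCacheSketch uncached = new GrammarCacheSketch(pid -> "grammar for " + pid, false);
    for (int i = 0; i < 4000; i++) {
      cached.grammarFor(id);
      uncached.grammarFor(id);
    }
    System.out.println("cached loads:   " + cached.loadCount());   // 1
    System.out.println("uncached loads: " + uncached.loadCount()); // 4000
  }
}
```

The sketch also shows where the danger mentioned above comes from: with caching on, whatever was loaded first under a given key is returned for every later document using that key, even if a later document would have expanded an external parameter entity differently. That is the reason for keeping the switch.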
Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI
Follow up--I tried giving BaseX the full 16GB of RAM and it still ultimately locked up with the memory meter showing 13GB. I'm thinking this must be some kind of memory leak. I tried importing the DITA Open Toolkit's documentation source and that worked fine with the max memory being about 2.5GB, but it's only about 250 topics. Cheers, E. -- Eliot Kimber http://contrext.com On 5/3/18, 4:59 PM, "Eliot Kimber" <basex-talk-boun...@mailman.uni-konstanz.de on behalf of ekim...@contrext.com> wrote: In the context of trying to do fun things with DITA docs in BaseX I downloaded the latest BaseX (9.0.1) and tried creating a new database and loading docs into it using the BaseX GUI. This is on macOS 10.13.4 with 16GB of hardware RAM available. My corpus is about 4000 DITA topics totaling about 30MB on disk. They are all in a single directory (not my decision) if that matters. Using the "parse DTDs" option and default indexing options (no token or full text indexes) I'm finding that even with 12GB of RAM allocated to the JVM the memory usage during load will eventually go to 12GB, at which point the processing appears to stop (that is, whatever I set the max memory to, when it's reached, things stop but I only got out of memory errors when I had much lower settings, like the default 2GB). I'm currently running a test with 14GB allocated and it is continuing but it does go to 12GB occasionally (watching the memory display on the Add progress panel). No individual file is that big--the biggest is 150K and typical is 30K or smaller. I wouldn't expect BaseX to have this kind of memory problem so I'm wondering if maybe there's an issue with memory on macOS or with DITA documents in particular (the DITA DTDs are notoriously large)? Should I expect BaseX to be able to load this kind of corpus with 14GB of RAM? Cheers, E. -- Eliot Kimber http://contrext.com
Re: [basex-talk] about special characters
That mangled string is the result of reading UTF-8 byte sequences as single-byte characters, e.g. ASCII or some Windows code page. How are you loading it into BaseX? It seems unlikely that BaseX-provided code would make this kind of basic mistake in reading text but it’s possible it applied the incorrect encoding for some reason. Cheers, Eliot -- Eliot Kimber http://contrext.com From: <basex-talk-boun...@mailman.uni-konstanz.de> on behalf of BitRider001 <bit.rider@pm.me> Reply-To: BitRider001 <bit.rider@pm.me> Date: Thursday, May 17, 2018 at 8:34 PM To: Bridger Dyson-Smith <bdysonsm...@gmail.com> Cc: "basex-talk@mailman.uni-konstanz.de" <basex-talk@mailman.uni-konstanz.de> Subject: Re: [basex-talk] about special characters Bridger, Indeed the file was exported from Excel in UTF-8 encoding. I've tried opening the CSV file using Notepad/Wordpad and in Linux with vi in a terminal and in both situations it displays the correct special character. It's only when I load it into a BaseX db and query it that it shows up, as you said, as "mangled". Saving the results into a text file also produces the "mangled" string. Strange. Bit ‐‐‐ Original Message ‐‐‐ On May 18, 2018 9:21 AM, Bridger Dyson-Smith <bdysonsm...@gmail.com> wrote: Bit - that's odd; it looks like the characters are being decomposed (or whatever the term is) and mangled but I'm not sure, unfortunately. Was the CSV an export from Excel? If so, I suppose this could be a Windows character set problem (cp-1252 or iso-8859-1 or something?). Bridger On Thu, May 17, 2018 at 9:11 PM BitRider001 <bit.rider@pm.me> wrote: Hi Bridger, Yes that is right. I'm on the latest (9.0.1). Attaching a screenshot here for anyone to take a look. Bit ‐‐‐ Original Message ‐‐‐ On May 18, 2018 8:41 AM, Bridger Dyson-Smith <bdysonsm...@gmail.com> wrote: Hi Bit - are you using the latest version? There was a problem with 9.0 and some Unicode characters. Christian and co. have a fix in v9.0.1.
HTH, Bridger On Thu, May 17, 2018, 7:54 PM BitRider001 <bit.rider@pm.me> wrote: Hi, I just joined the mailing list due to a problem I'm having displaying and storing special characters. I started with a CSV and created a database from it and the CSV is in UTF-8. However, when I query the special characters become garbled. I'm using the GUI in Windows 10. It starts with this in the CSV: Cañelas Then ends up with this when I export the query result into a text file: Ca�las Help please. Bit
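The mechanism Eliot describes at the top of this thread (UTF-8 bytes decoded as if each byte were one character) can be reproduced in a few lines. This is an illustration of the general effect only: the class MojibakeDemo is hypothetical, ISO-8859-1 is assumed as the wrong encoding, and the exact garbled string in the report differs, so it does not show what BaseX itself did with the CSV.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Illustration only: reproduces the "UTF-8 read as single-byte characters" effect. */
final class MojibakeDemo {

  /** Encodes text as UTF-8, then decodes those bytes with the given (wrong) charset. */
  static String misread(String text, Charset wrong) {
    return new String(text.getBytes(StandardCharsets.UTF_8), wrong);
  }

  /**
   * Reverses misread(). This only works when the wrong charset maps every byte
   * to a character and back without loss, which ISO-8859-1 does.
   */
  static String repair(String garbled, Charset wrong) {
    return new String(garbled.getBytes(wrong), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // "Cañelas", written with an escape so the source file encoding does not matter.
    String original = "Ca\u00F1elas";
    String garbled = misread(original, StandardCharsets.ISO_8859_1);

    // The two UTF-8 bytes of U+00F1 (0xC3 0xB1) become two separate characters.
    System.out.println("original length: " + original.length()); // 7
    System.out.println("garbled length:  " + garbled.length());  // 8
    System.out.println("round trip ok:   "
        + repair(garbled, StandardCharsets.ISO_8859_1).equals(original)); // true
  }
}
```

If the file on disk really is UTF-8, this suggests the thing to check is which encoding was assumed at import or export time, not the file itself.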
[basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI
In the context of trying to do fun things with DITA docs in BaseX I downloaded the latest BaseX (9.0.1) and tried creating a new database and loading docs into it using the BaseX GUI. This is on macOS 10.13.4 with 16GB of hardware RAM available. My corpus is about 4000 DITA topics totaling about 30MB on disk. They are all in a single directory (not my decision) if that matters. Using the "parse DTDs" option and default indexing options (no token or full text indexes) I'm finding that even with 12GB of RAM allocated to the JVM the memory usage during load will eventually go to 12GB, at which point the processing appears to stop (that is, whatever I set the max memory to, when it's reached, things stop but I only got out of memory errors when I had much lower settings, like the default 2GB). I'm currently running a test with 14GB allocated and it is continuing but it does go to 12GB occasionally (watching the memory display on the Add progress panel). No individual file is that big--the biggest is 150K and typical is 30K or smaller. I wouldn't expect BaseX to have this kind of memory problem so I'm wondering if maybe there's an issue with memory on macOS or with DITA documents in particular (the DITA DTDs are notoriously large)? Should I expect BaseX to be able to load this kind of corpus with 14GB of RAM? Cheers, E. -- Eliot Kimber http://contrext.com