[basex-talk] Code highlighting not working on Wiki
Hi, I just discovered that code highlighting for the samples on the BaseX wiki doesn't seem to be working fully. I noticed it a couple of days ago and thought it was temporary, but the problem is still there. Regards, Johan
Re: [basex-talk] Crash when starting basexgui
I'm running macOS Sonoma on an Apple Silicon M2 processor and ran into the same problem when Homebrew updated my JDK to 21.0.1 today. According to this article, it seems that applications like JMeter and IntelliJ need to fix this by implementing a specific interface. I guess this applies to basexgui as well? https://stackoverflow.com/questions/77283578/sonoma-and-nsapplicationdelegate-applicationsupportssecurerestorablestate Regards, Johan
Re: [basex-talk] BaseX HTTP service goes down due to Qualys Agent
Is the agent calling the stop port? https://docs.basex.org/wiki/Options#STOPPORT On Mon, 3 Apr 2023 at 17:38, wrote: > >> "You mentioned that the Jetty server “goes down”. What does that mean? > Does it simply block any further requests? Do you have a 100% CPU workload?" > It doesn't accept any further requests. Just launching the basexhttp.bat > revives it. > > >> "Does Jetty stall if you disable all REST, RESTXQ, and/or WebDAV?" > We never tried to disable anything. > The Qualys Agent runs once every two weeks on a schedule. So, it is not > easy to run on demand for testing. > > >> " Which BaseX services are enabled in your web.xml?" > We never modified anything in the web.xml. Please see it below. > > >xmlns="http://xmlns.jcp.org/xml/ns/javaee; > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; > xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee > http://www.oracle.com/webfolder/technetwork/jsc/xml/ns/javaee/web-app_4_0.xsd; > > version="4.0"> > > BaseX: The XML Database and XQuery Processor > HTTP Services > > > > > > org.basex.http.SessionListener > > > org.basex.http.ServletListener > > > > > > > RESTXQ > org.basex.http.restxq.RestXqServlet > > org.basex.user > admin > > 1 > > > RESTXQ > /* > > > > > WebSocket > org.basex.http.ws.WsServlet > > > > WebSocket > /ws/* > > > > > REST > org.basex.http.rest.RESTServlet > > > REST > /rest/* > > > > > WebDAV > org.basex.http.webdav.WebDAVServlet > > > WebDAV > /webdav/* > > > > > default > > useFileMappedBuffer > false > > > > default > /static/* > > > > > > -Original Message- > From: Christian Grün > Sent: Monday, April 3, 2023 11:27 AM > To: ykhab...@bellsouth.net > Cc: BaseX > Subject: Re: [basex-talk] BaseX HTTP service goes down due to Qualys Agent > > The logs look inconspicuous indeed. Some more questions: > > • You mentioned that the Jetty server “goes down”. What does that mean? > Does it simply block any further requests? Do you have a 100% CPU workload? 
> • Which BaseX services are enabled in your web.xml? Does Jetty stall if > you disable all REST, RESTXQ, and/or WebDAV? > > Best, > Christian > > > > On Mon, Apr 3, 2023 at 4:44 PM wrote: > > > > Hi Christian, > > > > IMO, it is just the number of requests. > > I attached the .log file. > > > > -Original Message- > > From: Christian Grün > > Sent: Monday, April 3, 2023 10:32 AM > > To: ykhab...@bellsouth.net > > Cc: BaseX > > Subject: Re: [basex-talk] BaseX HTTP service goes down due to Qualys > > Agent > > > > Hi Yitzhak, > > > > have you checked the resulting log files in the data/.logs directory? > > Are there specific requests that take too much time, or is it the plain > number of incoming requests that eventually slows down the system? > > > > Best, > > Christian > > > > > > On Mon, Apr 3, 2023 at 4:29 PM wrote: > > > > > > Hello, > > > > > > > > > > > > We are using BaseX 10.5 via its HTTP service in a corporate > environment. > > > > > > > > > > > > We have an automated Qualys Agent that does a vulnerability scan of > that server with the BaseX. > > > > > > Qualys Agent scan process includes web sites related tests such as > Cross-Site Scripting, SQL Injection, etc. > > > > > > The rapid nature of the Qualys Agent requests effectively gives us a > DoS attack on the eclipse.jetty.server. > > > > > > It cannot process so many requests and goes down. > > > > > > > > > > > > In the meantime, our solution is to restart BaseX HTTP service > manually via basexhttp.bat. > > > > > > > > > > > > Question: is it possible to somehow configure the eclipse.jetty.server > so it will be able to sustain the Qualys Agent vulnerability scan? > > > > > > > > > > > > > > > > > > Regards, > > > Yitzhak Khabinsky > > > > > > > >
Re: [basex-talk] right way to create a db in the middle of processing?
Take a look at https://docs.basex.org/wiki/Commands#Command_Scripts To my knowledge, this is the way to do it. Regards Johan On Thu, Nov 24, 2022 at 4:53 PM Graydon Saunders wrote: > Hello -- > > So I've got a pattern where I want to: > > 1. perform some processing using proc:execute() on a directory tree of XML > files (easy) > 2. load the result of the processing (a parallel structure tree of XML > files) into a new db >(in principle, easy; db:create() does this) > 3. extract information from the newly created db and write that to a file > (easy) > 4. use proc:execute() to run different processing on the file written in > step 3 (easy, I think) > 5. load the result (thankfully a single file) and process it with a query > (easy, I can stuff that in a function) > > Ideally, this winds up as something invoked from a single query file as a > sequence of functions because its eventual fate is to be part of an > automated test that would ideally be a "run this one thing, look at the > boolean result". > > I hang up on step 2; so far as I can tell, there isn't a way to say "go > create a database and then make it usable to these other modules that are > guaranteed not to happen until db:create() has completed" but this seems > like such a common thing to want to do that I feel like I must be missing > something. > > Thanks! > Graydon >
Re: [basex-talk] Accessing Java constant from XQuery
In your case, I think java:TITLE() would work. Trying it with java.lang.Integer, I got the following code to work:

declare namespace integer = "java:java.lang.Integer";
integer:MIN_VALUE()

On Wed, Feb 16, 2022 at 2:55 PM Paul L. Merchant Jr. <paul.l.merchant...@dartmouth.edu> wrote:
> Hi everyone, I'm using a Java library in XQuery in BaseX and I haven't
> been able to determine if there's a way to access a constant string (final
> static String... or even just final String...) in the Java class from
> XQuery.
>
> For example, if the Java class is declared as
>
> package my.module;
>
> public class MyModule {
>   public static final String TITLE = "I'm a module!";
>
>   public String hello(final String world) {
>     return "Hello " + world;
>   }
> }
>
> The XQuery to call the "hello" method is
>
> import module namespace java = 'java:my.module.MyModule';
>
> java:hello("World!")
>
> But is there any way to access "TITLE"?
>
> I'd like to share some constants between XQuery and Java, and I could
> write accessor functions for these constants, but it'd be cleaner if I
> didn't have to.
>
> Thanks!
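For comparison, the accessor-function fallback mentioned in the question could look roughly like the sketch below. The method name title() is hypothetical and not part of the original class, and the package declaration is left out so the snippet compiles on its own. Whether the direct constant access works may depend on the BaseX version in use.

```java
// Sketch only: MyModule as posted in the question, plus a hypothetical accessor.
final class MyModule {
  public static final String TITLE = "I'm a module!";

  // Hypothetical zero-argument accessor that exposes the constant as a function.
  public static String title() {
    return TITLE;
  }

  public String hello(final String world) {
    return "Hello " + world;
  }
}
```

With such an accessor, the constant is reachable through an ordinary function call, at the cost of one extra method per constant.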
Re: [basex-talk] 9.6 on Docker Hub
Hi Christian, Any news on when we might see an official Docker image of the 9.6.x release on docker hub? Regards, Johan On Tue, Sep 21, 2021 at 4:15 PM Christian Grün wrote: > Hi Johannes, > > Thanks for the pointer. It’s quite likely that the existing Docker > image will be replaced by a revised version (the point of time has not > been settled yet). > > Cheers, > Christian > > > On Tue, Sep 21, 2021 at 9:35 AM Johannes Bauer > wrote: > > > > Hi, > > > > I cannot find the latest BaseX release on Docker Hub (only the 9.5 > branch). > > Is this still planned? > > > > Thank you. > > > > Best Regards > > Johannes > > >
Re: [basex-talk] Brew upgrade for 9.4.6 fails with SHA256 mismatch
Sure! Pull request sent. /Johan On Fri, Jan 8, 2021 at 1:21 PM Christian Grün wrote: > Oh, thanks. So that was caused by our 9.4.6 version update (which the > redundant library files removed). > > I remember you contributed BaseX homebrew files in the past. Would you > mind sending a homebrew pull request with the updated SHA256 value for > us? That would be great! > > > > > On Fri, Jan 8, 2021 at 12:52 PM Johan Mörén wrote: > > > > Hi Christian, > > > > It's a property stated in the basex.rb formula file. > > > > > https://github.com/Homebrew/homebrew-core/blob/7da7fb15bfd1301a023b97ec91945c1985278edf/Formula/basex.rb#L6 > > > > /Johan > > > > On Fri, Jan 8, 2021 at 12:45 PM Christian Grün < > christian.gr...@gmail.com> wrote: > >> > >> Hi Johan, > >> > >> Thanks for the observation. > >> > >> > Expected: > c6dd9ac56e72de19c88152c76a6e4052deccd4f0db7a4b072560b92318a5 > >> > >> Do you know where this “expected” SHA256 value comes from? > >> > >> Cheers, > >> Christian >
Re: [basex-talk] Brew upgrade for 9.4.6 fails with SHA256 mismatch
Hi Christian, It's a property stated in the basex.rb formula file. https://github.com/Homebrew/homebrew-core/blob/7da7fb15bfd1301a023b97ec91945c1985278edf/Formula/basex.rb#L6 /Johan On Fri, Jan 8, 2021 at 12:45 PM Christian Grün wrote: > Hi Johan, > > Thanks for the observation. > > > Expected: > c6dd9ac56e72de19c88152c76a6e4052deccd4f0db7a4b072560b92318a5 > > Do you know where this “expected” SHA256 value comes from? > > Cheers, > Christian >
[basex-talk] Brew upgrade for 9.4.6 fails with SHA256 mismatch
Just wanted to let you know that the latest BaseX version can't be installed or upgraded via homebrew. I get the following error message:

brew upgrade basex
==> Upgrading 1 outdated package:
basex 9.4.5 -> 9.4.6
==> Upgrading basex 9.4.5 -> 9.4.6
==> Downloading https://files.basex.org/releases/9.4.6/BaseX946.zip
100.0%
Error: SHA256 mismatch
Expected: c6dd9ac56e72de19c88152c76a6e4052deccd4f0db7a4b072560b92318a5
Actual: 96012025d749062540f9b16f830d7b99876696b66534da2fe27c833d1b1fc43b
File: /Users/johmor/Library/Caches/Homebrew/downloads/befef7ec63e74dbf695bd24e06e16773055a9fb0c7f69d9d7e9fd1787766a377--BaseX946.zip
To retry an incomplete download, remove the file above.

Regards, Johan
Re: [basex-talk] Xquery recursion and db:add() - stack overflow
Hi Lars I have done some OAI-PMH fetches but never got into stack-overflow issues. I guess one workaround you can do on your part is to partition your query with date-ranges using the query parameters "from" and "until" on your initial call to the endpoint. Regards, Johan Mörén On Wed, May 11, 2016 at 5:07 PM Lars Johnsen <yoon...@gmail.com> wrote: > The basexgui startup file now contains: > > BASEX_JVM="-Xmx8g -Xss4m $BASEX_JVM" > > It helped the script a long way, but eventually it had to kneel. It works > fine though, on smaller datasets. > > Maybe there is some other way to get the data over. I'll have a talk with > the guys providing the OAI-endpoint. > > Thanks for the pointer to Xss! > > Lars > > 2016-05-11 14:38 GMT+02:00 Dirk Kirsten <d...@basex.org>: > >> Hello Lars, >> >> if you have a deep recursion Java will at some point hit its stack size >> limit. Have you already tried to simply increase the Java stack size, e.g. >> by passing the parameter -Xss2m to the JVM? >> >> Cheers >> Dirk >> >> >> On 05/11/2016 01:43 PM, Lars Johnsen wrote: >> >> The following code generates the error "Stack Overflow: try tail >> recursion?" >> >> The code reads in bibliographic data using OAI-PMH and updates a database >> for each chunk of data. With OAI-PMH, only part of the data is available >> for each request, so the server returns a resumption token if there are >> more data available. >> >> The xquery function making the queries is implemented recursively >> preceded by a database update request (see the last two lines) for each >> call. Is it db:add() that causes the stack overflow? The recursion cannot >> be placed further towards the end! 
>> >> declare %updating function local:getResumption($token) { >> if (empty($token)) then >> () >> else >> let $http-request := http:send-request($http-option, $URL || >> $token) >> let $result := >> if ($http-request instance of node()) then >> $http-request >>else >> {$http-request} >> >> let $resume := $result//oai:resumptionToken/text() >> return ( >> db:add($database,element chunk {$result//oai:metadata}, >> $path) , >> local:getResumption($resume) >> ) >> }; >> >> Best, >> Lars >> >> >> -- >> Dirk Kirsten, BaseX GmbH, http://basexgmbh.de >> |-- Firmensitz: Blarerstrasse 56, 78462 Konstanz >> |-- Registergericht Freiburg, HRB: 708285, Geschäftsführer: >> | Dr. Christian Grün, Dr. Alexander Holupirek, Michael Seiferle >> `-- Phone: 0049 7531 91 68 276, Fax: 0049 7531 20 05 22 >> >> >
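The date-range workaround suggested at the top of this message can be sketched as follows. This is only an illustration in Java, not code from the thread: the class name OaiPartitions, the base URL, and the choice of monthly slices are assumptions. The verb, metadataPrefix, from and until parameters are the standard OAI-PMH request arguments. Each slice can still return resumption tokens, but the chain per slice should be much shorter.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a harvest into one ListRecords request per month.
final class OaiPartitions {
  // Builds one request URL for every calendar month from first to last (inclusive).
  static List<String> monthlyRequests(String baseUrl, String metadataPrefix,
      YearMonth first, YearMonth last) {
    List<String> urls = new ArrayList<>();
    for (YearMonth ym = first; !ym.isAfter(last); ym = ym.plusMonths(1)) {
      LocalDate from = ym.atDay(1);          // first day of the month
      LocalDate until = ym.atEndOfMonth();   // last day of the month
      urls.add(baseUrl + "?verb=ListRecords&metadataPrefix=" + metadataPrefix
          + "&from=" + from + "&until=" + until);
    }
    return urls;
  }
}
```

Each generated URL would then be the initial call for one smaller harvest, so no single run has to follow the full token chain.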
Re: [basex-talk] XSD 1.1
Nice work! We use a lot of Schema 1.1 at my shop. /Johan On Thu, Sep 17, 2015 at 1:08 PM Christian Grün wrote: > Hi Alex (cc to the list), > > Your feedback was very helpful. And I've finally learnt more about XML > Schema 1.1 validation, which is now available in the new snapshot [1]. > I have also completely revised the documentation on our Validation > Module [2]. > > Looking forward to your feedback on your first tests, > Christian > > [1] http://files.basex.org/releases/latest/ > [2] http://docs.basex.org/wiki/Validation_Module#XML_Schema_Validation >
Re: [basex-talk] Multiple server instances on a single computer?
I have noticed that when two different versions of BaseX are used in embedded mode, in two different applications on the same application server, errors are logged about rewriting the .basex file. No exceptions are thrown; only the message appears in the log. Is there a way to create a QueryProcessor without the need for these files to be created? If not, I think it would be a nice feature to add. Regards, Johan Mörén On Mon, May 25, 2015 at 8:50 PM France Baril france.ba...@architextus.com wrote: It just works! Thank you!
Re: [basex-talk] Basex xquery import module classpath lookup
Support for this would be really great if it does not exist yet. I'm running a lot of queries in embedded mode and would like to extract some functions to modules that multiple queries can share, to reduce code duplication. Making it possible to give the query processor a customizable URIResolver would be great, since we package our resources in .jar files on deployment. Regards, Johan Mörén On Thu, Apr 9, 2015 at 6:13 PM, Karel Hovorka karel.hovo...@semantico.com wrote: We always use a custom URIResolver when working with classpath resources, afaik. A similar mechanism might be better than the original feature for the reasons you stated (duplication in the classpath). You can solve this by choosing the correct classloader in the *URIResolver. I've been digging a little into the BaseX source and I've found the org.basex.io.IO.get method, which does a similar thing (loading a resource based on a string). If this factory method is replaced by a customizable factory with a default implementation, that might solve the problem. However, I know very little about BaseX, and I may be completely wrong here. Karel On 09/04/2015 16:47, Christian Grün wrote: Thanks for the links. If I get it right, you need to explicitly call the URI resolver setter functions to get it working? Or does it mean that modules will automatically be available via import module if they are in the Java classpath (which is, if I got it right, the feature you asked for)? On Thu, Apr 9, 2015 at 5:02 PM, Karel Hovorka karel.hovo...@semantico.com wrote: Exactly, I am referring to import module.
In my experience with xml-related technologies and java, library usually supports customizable class for import resource lookup based on namespaces, paths etc, I can recall 2 examples: Saxon has class ModuleURIResolver that does just this:http://www.saxonica.com/html/documentation/javadoc/net/sf/saxon/s9api/XQueryCompiler.html#setModuleURIResolver%28net.sf.saxon.lib.ModuleURIResolver%29 Similar class URIResolver and mechanism is in java core for xsl:import/xsl:include lookup:http://docs.oracle.com/javase/7/docs/api/javax/xml/transform/TransformerFactory.html#setURIResolver%28javax.xml.transform.URIResolver%29 Thanks, Karel On 09/04/2015 15:41, Christian Grün wrote: I would like to load xquery files that are on classpath of my java application, not only filesystem. Does this mean, it is not possible with BaseX? Do you refer to the import module when talking about loading xquery files? The classpath is not considered when importing modules. As multiple directories can be specified in the classpath, it could even happen that different modules would be import candidates, so we'd need additional precedence rules. -- Is this possibly supported by any other query processor you have tried? Best, Christian Karel On 08/04/2015 18:01, Christian Grün wrote: Hi Karel, In BaseX, modules either need to be located in the module repository, or the relative path needs to be specified in the import module statement. Some more information on locating modules can be found in our Wiki [1]. Hope this helps, Christian [1] http://docs.basex.org/wiki/Repository On Wed, Apr 8, 2015 at 5:56 PM, Karel Hovorkakarel.hovo...@semantico.com karel.hovo...@semantico.com wrote: Hi, I am trying to run xquery file on basex, but it depends on xquery module located in classpath. I have found parameter org.basex.QUERYPATH, that configures module lookup on filesystem. Is there way to lookup modules in classpath? Thanks, Karel Hovorka, Semantico LTD
Re: [basex-talk] BaseX 8.0.1: Minor Patches
Hi! Just a reminder to update the homebrew distribution as well! Regards, Johan Mörén On Sun, Feb 22, 2015 at 10:07 PM Christian Grün christian.gr...@gmail.com wrote: Hi there, We have just released a first post-Prague version of BaseX, which includes some minor optimizations and fixes (nothing critical): XQUERY - Faster execution of single, index-based results - Iterative evaluation of steps with multiple predicates FIXES - WebDAV locking - Archive Module - Adaptive serialization of arrays and maps - Digest Authentication - Save command-line history In future, we will try our best to release bug fixes more regularly. This way, you can stay up-to-date without the need to check for new versions five times a day. Have fun, Christian [1] http://basex.org/downloads
Re: [basex-talk] Feature request
Hello!

Just wanted to report back that it works really well. It is about 50% slower than running the md5 command on the command line of my mac. A 4.15 gb file takes around 20 seconds in BaseX compared to 10 seconds using the native command. Not sure if this is a limitation in Java or if performance could be tweaked further. But at the moment it feels unimportant for our case. Thank you again for your swift reply and delivery!

Regards,
Johan Mörén

On Sun Jan 25 2015 at 1:56:21 PM Johan Mörén johan.mo...@gmail.com wrote:

Great news Christian. I'll try it out tomorrow at work! /Johan

On Sun, Jan 25, 2015 at 1:22 PM, Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

A new snapshot is available [1]. In the course of rewriting the hashing code, I further improved our streamlining architecture [2, 3].

Your testing feedback is welcome,
Christian

[1] http://files.basex.org/releases/latest/
[2] https://github.com/BaseXdb/basex/commit/b39b7
[3] https://github.com/BaseXdb/basex/commit/28139

On Sat, Jan 24, 2015 at 8:39 PM, Christian Grün christian.gr...@gmail.com wrote:

Thanks, this makes it much easier. I'll probably go for this one:

MessageDigest md = MessageDigest.getInstance(algo);
try(InputStream is = ...) {
  try(DigestInputStream dis = new DigestInputStream(is, md)) {
    while(dis.read() != -1);
  }
  return md.digest();
}

Keeping you updated,
Christian

On Sat, Jan 24, 2015 at 7:39 PM, Johan Mörén johan.mo...@gmail.com wrote:

Hi Christian,

I think you can go with Java's implementation all the way, like this:

MessageDigest md = MessageDigest.getInstance("MD5");
InputStream is = new FileInputStream("C:\\Temp\\Small\\Movie.mp4"); // Size 700 MB
byte[] buffer = new byte[blockSize];
int numRead;
do {
  numRead = is.read(buffer);
  if (numRead > 0) {
    md.update(buffer, 0, numRead);
  }
} while (numRead != -1);
byte[] digest = md.digest();

On Sat Jan 24 2015 at 6:49:18 PM Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

looks like a useful feature! Currently, we use Java's default implementation for computing hashes [1]. If you want to help us, you could look out for existing Java md5 hashing source code, which we could then adopt in BaseX!

Best,
Christian

[1] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/func/hash/HashFn.java

On Sat, Jan 24, 2015 at 11:37 AM, Johan Mörén johan.mo...@gmail.com wrote:

Hello!

We have been using the hashing module to calculate md5 checksums on binary files successfully for a while. But last week we received our first really large file (4.3 gb) and our script threw a

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

We are currently using the 7.8 version of BaseX. I suspect that BaseX materializes the stream returned by file:read-binary as a byte array when we call the hash:md5 function. This is a snippet of our script where the problem arises:

...
let $binary := file:read-binary($filePath)
let $checksum := lower-case(xs:string(xs:hexBinary(hash:md5($binary))))
...

I think a nice feature to add to BaseX could be either a new function in the file module called file-checksum($algorithm) that calculates checksums on files in a streaming fashion, or perhaps an option to the hashing functions that indicates that you want them to use streaming.

Regards,
Johan Mörén
Re: [basex-talk] Feature request
We run the script that uses this functionality embedded in a Java application. I noticed now that the first time the code runs after a cold start, this log message appears:

java.nio.file.FileSystemNotFoundException: Provider "wrap" not installed
  at java.nio.file.Paths.get(Paths.java:147)
  at org.basex.util.Prop.homePath(Prop.java:142)
  at org.basex.util.Prop.<clinit>(Prop.java:96)
  at org.basex.core.StaticOptions.<clinit>(StaticOptions.java:20)
  at org.basex.core.Context.<init>(Context.java:77)
  at org.basex.core.Context.<init>(Context.java:69)
  at se.kb.mimer.util.xquery.XQueryClient.extractPackageFilesData(XQueryClient.java:16)

The script still produces the expected output. I guess that this is a handled exception inside BaseX that gets printed to the log at INFO level. Am I right?

Regards,
Johan

On Mon Jan 26 2015 at 2:24:41 PM Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

> Just wanted to report back that it works really well.

Glad to hear it works.

> It is about 50% slower than running the md5 command on the command line of my mac.

My final solution is close to the one you proposed [1]: I decided to use a little buffer as well, because it was faster than calling md.update() for each single byte. Using nio channels gives us better performance:

String path = ...
RandomAccessFile raf = new RandomAccessFile(path, "r");
FileChannel ch = raf.getChannel();
ByteBuffer buf = ByteBuffer.allocate(IO.BLOCKSIZE);
final MessageDigest md = MessageDigest.getInstance("md5");
do {
  final int n = ch.read(buf);
  if(n == -1) break;
  md.update(buf.array(), 0, n);
  buf.flip();
} while(true);
System.out.println(Token.string(Token.hex(md.digest(), true)));

But I am not sure how smoothly this would integrate in our remaining streaming architecture, as we are also streaming main-memory objects. I'll keep it in mind, though.

Cheers,
Christian

[1] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/func/hash/HashFn.java
Re: [basex-talk] Feature request
Great news Christian. I'll try it out tomorrow at work! /Johan

On Sun, Jan 25, 2015 at 1:22 PM, Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

A new snapshot is available [1]. In the course of rewriting the hashing code, I further improved our streamlining architecture [2, 3].

Your testing feedback is welcome,
Christian

[1] http://files.basex.org/releases/latest/
[2] https://github.com/BaseXdb/basex/commit/b39b7
[3] https://github.com/BaseXdb/basex/commit/28139

On Sat, Jan 24, 2015 at 8:39 PM, Christian Grün christian.gr...@gmail.com wrote:

Thanks, this makes it much easier. I'll probably go for this one:

MessageDigest md = MessageDigest.getInstance(algo);
try(InputStream is = ...) {
  try(DigestInputStream dis = new DigestInputStream(is, md)) {
    while(dis.read() != -1);
  }
  return md.digest();
}

Keeping you updated,
Christian

On Sat, Jan 24, 2015 at 7:39 PM, Johan Mörén johan.mo...@gmail.com wrote:

Hi Christian,

I think you can go with Java's implementation all the way, like this:

MessageDigest md = MessageDigest.getInstance("MD5");
InputStream is = new FileInputStream("C:\\Temp\\Small\\Movie.mp4"); // Size 700 MB
byte[] buffer = new byte[blockSize];
int numRead;
do {
  numRead = is.read(buffer);
  if (numRead > 0) {
    md.update(buffer, 0, numRead);
  }
} while (numRead != -1);
byte[] digest = md.digest();

On Sat Jan 24 2015 at 6:49:18 PM Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

looks like a useful feature! Currently, we use Java's default implementation for computing hashes [1]. If you want to help us, you could look out for existing Java md5 hashing source code, which we could then adopt in BaseX!

Best,
Christian

[1] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/func/hash/HashFn.java

On Sat, Jan 24, 2015 at 11:37 AM, Johan Mörén johan.mo...@gmail.com wrote:

Hello!

We have been using the hashing module to calculate md5 checksums on binary files successfully for a while. But last week we received our first really large file (4.3 gb) and our script threw a

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

We are currently using the 7.8 version of BaseX. I suspect that BaseX materializes the stream returned by file:read-binary as a byte array when we call the hash:md5 function. This is a snippet of our script where the problem arises:

...
let $binary := file:read-binary($filePath)
let $checksum := lower-case(xs:string(xs:hexBinary(hash:md5($binary))))
...

I think a nice feature to add to BaseX could be either a new function in the file module called file-checksum($algorithm) that calculates checksums on files in a streaming fashion, or perhaps an option to the hashing functions that indicates that you want them to use streaming.

Regards,
Johan Mörén
Re: [basex-talk] Feature request
Great to hear Christian! You guys respond really fast :) /Johan

On Sat Jan 24 2015 at 8:40:04 PM Christian Grün christian.gr...@gmail.com wrote:

Thanks, this makes it much easier. I'll probably go for this one:

MessageDigest md = MessageDigest.getInstance(algo);
try(InputStream is = ...) {
  try(DigestInputStream dis = new DigestInputStream(is, md)) {
    while(dis.read() != -1);
  }
  return md.digest();
}

Keeping you updated,
Christian

On Sat, Jan 24, 2015 at 7:39 PM, Johan Mörén johan.mo...@gmail.com wrote:

Hi Christian,

I think you can go with Java's implementation all the way, like this:

MessageDigest md = MessageDigest.getInstance("MD5");
InputStream is = new FileInputStream("C:\\Temp\\Small\\Movie.mp4"); // Size 700 MB
byte[] buffer = new byte[blockSize];
int numRead;
do {
  numRead = is.read(buffer);
  if (numRead > 0) {
    md.update(buffer, 0, numRead);
  }
} while (numRead != -1);
byte[] digest = md.digest();

On Sat Jan 24 2015 at 6:49:18 PM Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

looks like a useful feature! Currently, we use Java's default implementation for computing hashes [1]. If you want to help us, you could look out for existing Java md5 hashing source code, which we could then adopt in BaseX!

Best,
Christian

[1] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/func/hash/HashFn.java

On Sat, Jan 24, 2015 at 11:37 AM, Johan Mörén johan.mo...@gmail.com wrote:

Hello!

We have been using the hashing module to calculate md5 checksums on binary files successfully for a while. But last week we received our first really large file (4.3 gb) and our script threw a

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

We are currently using the 7.8 version of BaseX. I suspect that BaseX materializes the stream returned by file:read-binary as a byte array when we call the hash:md5 function. This is a snippet of our script where the problem arises:

...
let $binary := file:read-binary($filePath)
let $checksum := lower-case(xs:string(xs:hexBinary(hash:md5($binary))))
...

I think a nice feature to add to BaseX could be either a new function in the file module called file-checksum($algorithm) that calculates checksums on files in a streaming fashion, or perhaps an option to the hashing functions that indicates that you want them to use streaming.

Regards,
Johan Mörén
[basex-talk] Feature request
Hello!

We have been using the hashing module to calculate md5 checksums on binary files successfully for a while. But last week we received our first really large file (4.3 gb) and our script threw a

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

We are currently using the 7.8 version of BaseX. I suspect that BaseX materializes the stream returned by file:read-binary as a byte array when we call the hash:md5 function. This is a snippet of our script where the problem arises:

...
let $binary := file:read-binary($filePath)
let $checksum := lower-case(xs:string(xs:hexBinary(hash:md5($binary))))
...

I think a nice feature to add to BaseX could be either a new function in the file module called file-checksum($algorithm) that calculates checksums on files in a streaming fashion, or perhaps an option to the hashing functions that indicates that you want them to use streaming.

Regards,
Johan Mörén
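To illustrate what "in a streaming fashion" could mean here, the following is a minimal Java sketch, not BaseX code. The class name StreamingDigest and the 8 KB buffer size are assumptions. It feeds the input to java.security.MessageDigest in small chunks, so memory use stays constant regardless of file size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of BaseX: hashes a stream chunk by chunk.
final class StreamingDigest {
  // Returns the lower-case hex digest of everything readable from the stream.
  static String hex(InputStream in, String algorithm) {
    try {
      MessageDigest md = MessageDigest.getInstance(algorithm);
      byte[] buffer = new byte[8192];   // assumed chunk size
      int n;
      while ((n = in.read(buffer)) != -1) {
        md.update(buffer, 0, n);        // only the bytes actually read
      }
      StringBuilder sb = new StringBuilder();
      for (byte b : md.digest()) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
    }
  }
}
```

For a file, the stream would typically come from java.nio.file.Files.newInputStream(path) inside a try-with-resources block.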
Re: [basex-talk] Http-module error response-bodies always returned as string
Hi Christian,

No, it's another container: a Camel route on a Karaf instance exposed as a REST service. I'm pretty sure this is due to the handling of errors in HttpPayload.java. I'm still on BaseX 7.9, if that matters, but when I checked the code I peeked into the master branch.

Regards,
Johan Mörén

On Tue Nov 25 2014 at 6:44:28 PM Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

Thanks for bringing this up. I am currently trying to find out if this is a server-side or client-side issue. Is the web service you are talking to (on port 9595) a BaseX instance, too?

Thanks in advance,
Christian

On Tue, Nov 25, 2014 at 8:49 AM, Johan Mörén johan.mo...@gmail.com wrote:

Hi,

I have a REST service that communicates details about any errors that occur during execution via an XML payload in the response body. I have noticed that the HTTP module ignores the Content-Type header of the response if the status code is > 399. The response is then flagged as being in error, and in HttpPayload this line forces the content type of the body to text/plain:

HttpPayload.java
// error: use text/plain as content type
final String ct = error ? TEXT_PLAIN : utype != null ? utype : contentType(ctype);

Overriding the content type via the override-media-type attribute of http:request doesn't help in this case either. Is there any special reason that error response bodies are handled in this way? To my knowledge, it is very common for REST services to provide error information by sending it in the response body. I guess that changing the default behaviour might break existing implementations. But perhaps an additional attribute in the http:request element could flag that the media type of the Content-Type header should be used even with error responses. Or perhaps this could be done via an additional option in the http:send-request() function.

Example of a request and response where this error occurs:
Request:

http:send-request(
  <http:request override-media-type="application/xml"
    href="{'http://localhost:9595/sequencegenerator/objectInstanceId/' || '0'}"
    method="get"/>
)

Response:

<http:response xmlns:http="http://expath.org/ns/http-client" status="400" message="Bad Request">
  <http:header name="nbrOfIds" value="0"/>
  <http:header name="typeOfId" value="objectInstanceId"/>
  <http:header name="Transfer-Encoding" value="chunked"/>
  <http:header name="breadcrumbId" value="ID-2013M-2-local-61198-1416819131884-37-54"/>
  <http:header name="User-Agent" value="Java/1.7.0_45"/>
  <http:header name="Accept" value="text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2"/>
  <http:header name="Content-Type" value="application/xml"/>
  <http:header name="Server" value="Jetty(8.1.15.v20140411)"/>
  <http:body media-type="text/plain"/>
</http:response>
&lt;faultResponse&gt;&lt;errorMessage&gt;number of ids must be a positive integer larger than 0&lt;/errorMessage&gt;&lt;/faultResponse&gt;

Regards,
Johan Mörén
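The opt-in behaviour proposed in this message could be sketched like this. It is a hypothetical illustration in Java, not the actual HttpPayload code: the class BodyMediaType, the flag honorTypeOnError, and the application/octet-stream fallback are all assumptions made for the example.

```java
// Hypothetical sketch of the proposed media type selection; not BaseX code.
final class BodyMediaType {
  // error:            the response status signals an error
  // honorTypeOnError: assumed opt-in flag from the proposal
  // overrideType:     value of override-media-type, or null
  // headerType:       media type from the Content-Type header, or null
  static String choose(boolean error, boolean honorTypeOnError,
      String overrideType, String headerType) {
    // Behaviour described in the post: error bodies are treated as plain text.
    if (error && !honorTypeOnError) return "text/plain";
    // Otherwise an explicit override wins over the response header.
    if (overrideType != null) return overrideType;
    // Assumed fallback when the server sent no Content-Type at all.
    return headerType != null ? headerType : "application/octet-stream";
  }
}
```

With the flag set, the 400 response above would have its body parsed as application/xml instead of being returned as an escaped string.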
Re: [basex-talk] Http-module error response-bodies always returned as string
Thanks Christian! I will try out the code tomorrow at work. Thanks for the fast feedback!

Regards, Johan

On Tue Nov 25 2014 at 7:58:26 PM Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

I modified both the server- and client-side HTTP code to make it conform better to expectations. Could you possibly check out the latest stable snapshot [1] and give me feedback?

Thanks, Christian

[1] http://files.basex.org/releases/latest/
[basex-talk] Http-module error response-bodies always returned as string
Hi,

I have a REST service that communicates details about any errors that occur during execution via an XML payload in the response body. I have noticed that the HTTP module ignores the Content-Type header of the response if the status code is greater than 399. The response is then flagged as being in error, and in HttpPayload this line forces the content type of the body to text/plain:

HttpPayload.java

    // error: use text/plain as content type
    final String ct = error ? TEXT_PLAIN : utype != null ? utype : contentType(ctype);

Overriding the content type via the override-media-type attribute of http:request doesn't help in this case either.

Is there any special reason that error response bodies are handled in this way? To my knowledge, it is very common for REST services to provide error information in the response body. I guess that changing the default behaviour might break existing implementations. But perhaps an additional attribute in http:request could flag that the media type from the Content-Type header should be used even for error responses. Or this could be done via an additional option in the http:send-request() function.

Example of a request and response where this error occurs.

Request:

    http:send-request(
      <http:request override-media-type="application/xml"
        href="{'http://localhost:9595/sequencegenerator/objectInstanceId/' || '0'}"
        method="get"/>
    )

Response:

    <http:response xmlns:http="http://expath.org/ns/http-client" status="400" message="Bad Request">
      <http:header name="nbrOfIds" value="0"/>
      <http:header name="typeOfId" value="objectInstanceId"/>
      <http:header name="Transfer-Encoding" value="chunked"/>
      <http:header name="breadcrumbId" value="ID-2013M-2-local-61198-1416819131884-37-54"/>
      <http:header name="User-Agent" value="Java/1.7.0_45"/>
      <http:header name="Accept" value="text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2"/>
      <http:header name="Content-Type" value="application/xml"/>
      <http:header name="Server" value="Jetty(8.1.15.v20140411)"/>
      <http:body media-type="text/plain"/>
    </http:response>
    &lt;faultResponse&gt;&lt;errorMessage&gt;number of ids must be a positive integer larger than 0&lt;/errorMessage&gt;&lt;/faultResponse&gt;

Regards, Johan Mörén
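The opt-in flag proposed in the message above could look roughly like the following Java sketch. It is not the BaseX implementation: the class name, the `keepTypeOnError` parameter, and the fallback to text/plain when no header is present are assumptions made for illustration, and the real `contentType(ctype)` helper is replaced by a plain null check.

```java
/** Hypothetical sketch of the content-type choice; not the actual HttpPayload code. */
final class ContentTypeChooser {
  static final String TEXT_PLAIN = "text/plain";

  /**
   * Picks the media type used to interpret a response body.
   *
   * @param error           true if the status code signals an error (400 or higher)
   * @param overrideType    value of override-media-type, or null if not set
   * @param headerType      media type from the Content-Type header, or null
   * @param keepTypeOnError hypothetical opt-in flag: honour the types even on error
   */
  static String choose(boolean error, String overrideType, String headerType,
      boolean keepTypeOnError) {
    // behaviour described in the message: errors are always read as plain text
    if (error && !keepTypeOnError) return TEXT_PLAIN;
    // an explicit override wins over the response header
    if (overrideType != null) return overrideType;
    // assumed fallback when the server sent no Content-Type header
    return headerType != null ? headerType : TEXT_PLAIN;
  }

  public static void main(String[] args) {
    System.out.println(choose(true, null, "application/xml", false)); // text/plain
    System.out.println(choose(true, null, "application/xml", true));  // application/xml
  }
}
```

Keeping the flag off by default would leave existing queries untouched, which addresses the compatibility concern raised in the message.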
Re: [basex-talk] 7.8.2 application on Mac OS X
Any plans to continue releasing it via Homebrew?

Regards, Johan

On Thu, Apr 24, 2014 at 10:07 AM, Michael Seiferle m...@basex.org wrote:

Hi James,

my bad! I forgot to fix the 7.8.2 bundle; the “latest” bundle should already work for you: http://files.basex.org/releases/BaseX-latest.app.zip

I’ll try to update the 7.8.2 bundle this evening and will let you know once that has happened.

Best, Michael

On 23.04.2014 at 15:33, James Ball basex-t...@jamesball.co.uk wrote:

Hello,

Is anyone else having problems with the BaseX 7.8.2.app.zip file on Mac OS X (downloaded from the BaseX site)? When I unzip the file I get a no-entry sign over the application icon, and launching the application gives “You can’t open the application ‘BaseX782.app’ because it may be damaged or incomplete.”

I think the name of the application should be BaseX.app rather than BaseX782.app, so I think the packaging has gone wrong somewhere. Inside the package is a file called EMPTY and a BaseX.app, but this doesn’t launch for me either.

Many thanks, James
Re: [basex-talk] 7.8.2 application on Mac OS X
Thanks! /Johan

On Thu, Apr 24, 2014 at 11:01 PM, Arve Gengelbach a...@basex.org wrote:

Hi,

Indeed it is easy. I just added a pull request to Homebrew: https://github.com/Homebrew/homebrew/pull/28686 The new version should be available soon via brew.

Cheers, Arve

On 24 Apr 2014, at 17:35, Dirk Kirsten d...@basex.org wrote:

Hello Johan,

sure, I think so. Jens normally does this, but he is currently on vacation. Michael just said he is going to do this as soon as he has some time, so hang in there. If you like, you can also do it yourself, I guess. Looking at the commit, it seems really easy to do: https://github.com/JensErat/homebrew/commit/4d4fb91fa25ba7ff5ef9ec814faf1d08323661bb I don't have Mac OS, so contributions are very welcome here.

Cheers, Dirk
[basex-talk] stream binary responses from http:module
Hi,

I have built a small client in XQuery that fetches zip files from a remote service using the HTTP module and stores them on disk with the help of the File module. Some of the returned zip files are really large, and BaseX seems to materialize the files in memory before writing them to disk, resulting in out-of-memory errors for some files.

Is there some way to read the response in a more streamable fashion when using the HTTP module? I tried the fetch:binary function successfully, but I really need the more extended functionality of the HTTP module.

    (: results in out of memory if the zip file is too large :)
    file:write-binary("file.zip", http:send-request(...)[2])

    (: works all the time :)
    file:write-binary("file.zip", fetch:binary(...))

Regards, Johan

___ BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
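The difference between materializing a body and streaming it can be sketched in plain Java. This is a JDK-only illustration of the general technique, not a description of how BaseX handles responses internally; the class and method names are hypothetical, and an in-memory stream stands in for a real HTTP response body.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: writes a response body to disk without buffering it fully. */
final class StreamToDisk {
  /** Copies the stream to the target file in chunks and returns the bytes written. */
  static long save(InputStream body, Path target) throws IOException {
    try (InputStream in = body) {
      // Files.copy reads and writes through a fixed-size buffer, so memory use
      // stays bounded regardless of the size of the body
      return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }
  }

  public static void main(String[] args) throws IOException {
    // stand-in for a large zip download; a real client would pass the
    // input stream of its HTTP connection instead
    InputStream fakeBody = new ByteArrayInputStream(new byte[1 << 20]);
    Path target = Files.createTempFile("download", ".zip");
    long written = save(fakeBody, target);
    System.out.println(written + " bytes written to " + target);
    Files.delete(target);
  }
}
```

A streaming variant of the HTTP module would need to hand the body over in this fashion while still exposing the status code and headers, which is the combination asked for above.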
Re: [basex-talk] stream binary responses from http:module
Hi Christian,

The options I need from the HTTP module are at the moment mainly the ability to authenticate. Getting hold of response/request headers and the status code is also very useful if you want to do more detailed error handling. Extending the HTTP module sounds like my favourite as well.

/Johan

On Thu, Mar 27, 2014 at 1:57 PM, Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

the HTTP Module is pretty magic, because it automatically tries to convert the input to the expected result format. This makes it difficult to stream. We have already pondered two options to circumvent this restriction:

* Fetch Module: extend the function signatures with additional options
* HTTP Module (our favorite): add additional functions (e.g. http:get(), http:post(), etc.) with xs:base64Binary as return type.

Which options of the HTTP Module do you currently use?

Christian
Re: [basex-talk] stream binary responses from http:module
Hi Christian,

I was aware of that possibility, but it only solves one of the problems. Some way of streaming a large response as text or binary with the HTTP module would be desirable. Having control of the HTTP headers of the request and response is getting more and more important when interacting with REST services.

/Johan

On Thu, Mar 27, 2014 at 4:39 PM, Christian Grün christian.gr...@gmail.com wrote:

Hi Johan,

that's something you may know anyway, but... you can as well specify the authentication data in your URL: http://name:password@...

Hope this helps, Christian
Re: [basex-talk] function parse-xml() very slow in embedded mode
Setting this option in the query proved useful:

    declare option db:intparse "yes";

It halved the query time in the GUI and provided the same result and performance in embedded mode with the QueryProcessor.

/Johan

On Wed, Feb 19, 2014 at 10:26 AM, Johan Mörén hutchkint...@gmail.com wrote:

Hi,

I'm using BaseX in embedded mode (no database) and need to transform an XML payload that contains elements holding string-encoded XML in this form:

    ...
    <str>&lt;objectInstance fetchDate="2014-02-14T13:00:53.374+01:00"&gt;
    &lt;createdDateTime&gt;2012-12-04T08:55:26.195Z&lt;/createdDateTime&gt;
    &lt;/objectInstance&gt;</str>
    ...

I have tried both XQuery Update (copy, modify) and a simple identity-transform function that intercepts the <str/> element and replaces it with the result of parse-xml(.). When executing the query in the BaseX GUI, both solutions perform very well. However, if I invoke it via the QueryProcessor in my Java process, it is 50 to 100 times slower. Any clues as to what I might be missing?

Regards, Johan Mörén
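For readers who want to see the transformation itself, here is a JDK-only Java sketch of the same idea: every str element is replaced by the XML that its text encodes. It uses the standard DOM API rather than BaseX or parse-xml(), so it says nothing about the performance difference discussed above; the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: expands string-encoded XML held in str elements. */
final class EmbeddedXmlExpander {
  private static Document parse(DocumentBuilder builder, String xml) throws Exception {
    return builder.parse(new InputSource(new StringReader(xml)));
  }

  /** Parses the payload and replaces each str element with the XML it encodes. */
  static Document expand(String payload) throws Exception {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = parse(builder, payload);

    // the NodeList is live, so collect the targets before modifying the tree
    NodeList found = doc.getElementsByTagName("str");
    List<Element> targets = new ArrayList<>();
    for (int i = 0; i < found.getLength(); i++) targets.add((Element) found.item(i));

    for (Element str : targets) {
      // getTextContent() unescapes &lt; and &gt;, yielding the embedded markup
      Document inner = parse(builder, str.getTextContent());
      Node imported = doc.importNode(inner.getDocumentElement(), true);
      str.getParentNode().replaceChild(imported, str);
    }
    return doc;
  }

  public static void main(String[] args) throws Exception {
    String payload = "<doc><str>&lt;objectInstance&gt;"
        + "&lt;createdDateTime&gt;2012-12-04T08:55:26.195Z&lt;/createdDateTime&gt;"
        + "&lt;/objectInstance&gt;</str></doc>";
    Document result = expand(payload);
    System.out.println(result.getDocumentElement().getFirstChild().getNodeName());
  }
}
```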
[basex-talk] Using BaseX as a standalone xquery library in Java
Is there anything special to think about when using BaseX outside the database context?

I have built some solutions using the QueryProcessor for generating file system reports, using the supplied modules for calculating file sizes and checksums. I have also built a number of wrappers for calling and aggregating results from web services. All these queries are embedded and run in our normal Java processes.

Everything has worked out really well, and BaseX gives you access to a lot of good functionality, like XQuery 3.0 and a rich set of modules. But I'm a bit curious whether there are any drawbacks or special things to think about when using BaseX as a general XQuery library, as you would use Saxon, for example.

Regards, Johan Mörén