Re: [basex-talk] file:read-text-lines performance

2019-01-15 Thread George Sofianos
There also seems to be a difference in how read-text-lines is used. The following similar queries produce different query paths and have different memory usage. This is probably why I can't benefit from the update on more complex queries. 1) return count(file:read-text-lines($file,

Re: [basex-talk] file:read-text-lines performance

2019-01-15 Thread George Sofianos
Hi Christian, what I failed to mention last time was that I was using the offset / limit mode of the file:read-text-lines. I never tried to load the whole file into memory with the previous version, because I thought it would be inefficient. I just tried now with the latest snapshot using a

Re: [basex-talk] many distinct namespaces

2019-01-15 Thread Liam R. E. Quin
On Fri, 2018-11-02 at 17:25 +0100, Christian Grün wrote: did this ever happen? > > Some more details: The current storage layout per node has been fixed > to 16 bytes. One byte (8 bits) is reserved for the namespace > reference. Here are a couple of hacky approaches in the spirit of

[basex-talk] Documentation suggestion

2019-01-15 Thread Marco Lettere
Hi Christian, I suggest adding an explicit statement of the default value of the combine option of map:merge. Currently the only reference is in the change notes at the bottom of the page. Thanks! Marco.

Re: [basex-talk] Concurrency WebDAV vs. REST API

2019-01-15 Thread Christian Grün
> The documentation perhaps does not explain the behavior properly. But anyway… > A write operation in any database should not block parallel read operations. > Read-last-committed as isolation is what one expects, but not blocking of read > operations > until the end of the write transaction. It

Re: [basex-talk] file:read-text-lines performance

2019-01-15 Thread George Sofianos
Wow, thanks for your fast response! I will give it a try tonight, George. On 1/15/19 1:48 PM, Christian Grün wrote: Hi George, I’m glad to announce that files are now processed in an iterative manner [1,2]. That’s something I wanted to try a while ago, and your mail was another motivation to

Re: [basex-talk] file:read-text-lines performance

2019-01-15 Thread Christian Grün
Hi George, I’m glad to announce that files are now processed in an iterative manner [1,2]. That’s something I wanted to try a while ago, and your mail was another motivation to get it done. It works pretty well: I reduced the JVM memory to a tiny maximum of 4 MB, and I managed to count the line

Re: [basex-talk] file:read-text-lines performance

2019-01-15 Thread George Sofianos
Hi Christian, On 1/15/19 12:43 PM, Christian Grün wrote: What are your experiences with using a single thread? If memory consumption is too exhaustive, you could play with the window clause of the FLWOR expression [2,3]. It takes some time to explore the full magic of this XQuery 3.0 extension

Re: [basex-talk] file:read-text-lines performance

2019-01-15 Thread Christian Grün
Hi George, an interesting use case. Reading lines of a text file feels like a natural candidate for iterative processing. As we need to ensure that the accessed file will eventually be closed, it is completely parsed before its contents can be accessed (all this happens in [1]). In future, we

[basex-talk] Serialization issue with HTTP response

2019-01-15 Thread Marco Lettere
Hi all, we have to deal with a third-party REST service which in case of error conditions returns this mime type: /*application/problem+json; charset=utf-8*/. I wrote this RestXQ [1] to mock it. Just copy it into restxq.xqm... When I call it like [2] I get /*[XPTY0004] Cannot convert

[basex-talk] file:read-text-lines performance

2019-01-15 Thread George Sofianos
Hello, I'm trying to read a 4 GB text file with 5 million lines and parse its contents. I'm using the file:read-text-lines function to do that. I managed to use fork-join and use 16 CPU threads to read the whole file by reading 1