Re: [basex-talk] archive

2018-08-23 Thread Giuseppe Celano
This possibility to open zipped XMLs via doc() is awesome. Universität Leipzig Institute of Computer Science, NLP Augustusplatz 10 04109 Leipzig Deutschland E-mail: cel...@informatik.uni-leipzig.de E-mail: giuseppegacel...@gmail.com Web site 1:

Re: [basex-talk] Huge CSV

2018-08-12 Thread Giuseppe Celano
Yes, I build them, but I do not use them explicitly all the time. > On Aug 13, 2018, at 12:04 AM, Liam R. E. Quin wrote: > > On Sun, 2018-08-12 at 23:58 +0200, Giuseppe Celano wrote: >> more documents accessed sequentially is better than one >> big file. >

Re: [basex-talk] Huge CSV

2018-08-12 Thread Giuseppe Celano
in the database, as far as I can see: more documents accessed sequentially is better than one big file. Ciao, Giuseppe > On Aug 10, 2018, at 9:09 PM, Liam R. E. Quin wrote: > > On Fri, 2018-08-10 at 13:43 +0200, Giuseppe Celano wrote: >> I uploaded the file, as it is, in the database,

Re: [basex-talk] Huge CSV

2018-08-10 Thread Giuseppe Celano
I uploaded it as csv (it is csv) via the GUI and it is then converted into XML (this conversion probably makes it too big) > On Aug 10, 2018, at 1:50 PM, Christian Grün wrote: > >> I uploaded the file, as it is, in the database > > So you uploaded the file as binary? Did you try to import it

Re: [basex-talk] Huge CSV

2018-08-10 Thread Giuseppe Celano
g 10, 2018 at 1:36 PM Giuseppe Celano > wrote: >> >> Hi, >> >> I am trying to work with a huge CSV file (about 380 MB), but If I built the >> database it seems that even simple operations cannot be evaluated. Is >> splitting the CSV file the only option or am I missing something here? >> Thanks. >> >> Giuseppe >> >> >

[basex-talk] Huge CSV

2018-08-10 Thread Giuseppe Celano
Hi, I am trying to work with a huge CSV file (about 380 MB), but if I build the database, it seems that even simple operations cannot be evaluated. Is splitting the CSV file the only option, or am I missing something here? Thanks. Giuseppe

Re: [basex-talk] Parallelization

2018-07-24 Thread Giuseppe Celano
I have to experiment more, but since I tried to copy many xml files (which can take some time) and did not see a difference, I would be tempted to say that maybe the problem is something else. But as soon as I have some time, I will test it again and let you know. > On Jul 24, 2018, at 9:55

Re: [basex-talk] Parallelization

2018-07-24 Thread Giuseppe Celano
I tried with and without xquery:fork-join and I do not see any real difference as far as evaluation time is concerned. When it works, time gets, approximately, halved. In my "activity monitor" I can actually see more R processes started by BaseX, but in the other case I cannot see any new

Re: [basex-talk] Parallelization

2018-07-24 Thread Giuseppe Celano
Hi Christian, Thanks for the reply. My query is of the type (simplified (pseudo)code): let $u := for $r in (list of document names) let $dirToWrite := "/directory/" || $r return function () { ( file:write($dirToWrite,

Re: [basex-talk] Db directory

2018-07-23 Thread Giuseppe Celano
OK. Since I would like to create different databases for different projects, it seems that the best strategy is to have a new (complete) BaseX folder for each project. BaseX is so light that this does not seem to be an issue, but I was not sure this was the way to go. Thanks! Giuseppe

Re: [basex-talk] Db directory

2018-07-23 Thread Giuseppe Celano
DBPATH is > a global option. It cannot be assigned at runtime; instead, it must be > assigned before BaseX is started > > Ciao, > Christian > > [1] http://docs.basex.org/wiki/Options > > > > On Mon, Jul 23, 2018 at 4:19 PM Giuseppe Celano > wrote: >>

[basex-talk] Db directory

2018-07-23 Thread Giuseppe Celano
Hi, I would like to create a database in a directory which is not "data" within the BaseX folder. I used (within the GUI) the command but it does not work. How can I specify that? Thanks. Ciao, Giuseppe

[basex-talk] Parallelization

2018-07-22 Thread Giuseppe Celano
I am having fun with xquery:fork-join() and I see that it really reduces evaluation time (!): I apply the same script to a collection of files, and if I use xquery:fork-join() it takes about half of the time. My computer has two cores. I was wondering what would happen if a computer had more
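As a rough illustration of the pattern described in this message (a minimal sketch, not the poster's actual script): xquery:fork-join() from the BaseX XQuery Module takes a sequence of zero-argument functions and evaluates them in parallel. The summation used as the workload here is only a placeholder.

```xquery
(: Build one zero-argument function per task; the summation is a placeholder workload. :)
let $tasks :=
  for $n in 1 to 4
  return function() { sum(1 to $n * 100000) }
(: Evaluate the functions in parallel and return their combined results. :)
return xquery:fork-join($tasks)
```

Any speed-up presumably depends on the number of available cores and on whether the tasks are bound by CPU or by I/O.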

[basex-talk] serialize vs csv:serialize

2018-07-20 Thread Giuseppe Celano
Hi All, I am not sure whether the serialize function is working properly (the first example works, the second does not, because instead of tabs I get commas, and there is no way to specify to add the header) f f f f f f => csv:serialize(map{"header":"yes",
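For comparison, a minimal sketch of csv:serialize() with a header line and tab separators. The sample data is hypothetical, and the option values assume the CSV Module as documented for BaseX 9.x; accepted values may differ between versions.

```xquery
(: Hypothetical sample data in the default ("direct") CSV XML format. :)
let $csv :=
  <csv>
    <record><Name>a</Name><Value>1</Value></record>
    <record><Name>b</Name><Value>2</Value></record>
  </csv>
(: 'header' and 'separator' are CSV Module options; 'tab' selects a tab character. :)
return csv:serialize($csv, map { 'header': true(), 'separator': 'tab' })
```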

Re: [basex-talk] maps

2018-07-13 Thread Giuseppe Celano
> Which ordering criteria does this particular dictionary use? It is the insertion order. I am just converting this code for fun and to run some tests. I will definitely have a look at Leo's code as well! > On Jul 13, 2018, at 11:16 AM, Christian Grün > wrote: > > Hi Giuseppe, > >> I

Re: [basex-talk] maps

2018-07-13 Thread Giuseppe Celano
> On Jul 13, 2018, at 12:57 AM, Giuseppe Celano > wrote: > > Hi > > Is it possible to preserve the order of the keys in a map when the map is > returned?: > > map{"b": 2, "c": 2,

[basex-talk] maps

2018-07-12 Thread Giuseppe Celano
Hi, Is it possible to preserve the order of the keys in a map when the map is returned? For example, map{"b": 2, "c": 2, "a": 3} returns map { "a": 3, "b": 2, "c": 2 }. Thanks! Giuseppe
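Maps in XQuery 3.1 are unordered, so the key order of a returned map is not guaranteed. One common workaround, sketched here with the values from the question, is to keep the desired key order in a separate sequence and iterate over that sequence:

```xquery
(: The insertion order is kept separately, because the map itself is unordered. :)
let $keys := ("b", "c", "a")
let $map := map { "b": 2, "c": 2, "a": 3 }
for $key in $keys
return $key || ": " || $map($key)
```

This returns the entries in the order b, c, a. An array of single-entry maps would be another option.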

Re: [basex-talk] slash operator in Basex 9.0.2

2018-07-06 Thread Giuseppe Celano
ike a too eager optimization. Did you have a chance to look at the > resulting query plan? > > > > Giuseppe Celano <mailto:cel...@informatik.uni-leipzig.de>> wrote on Fri, 6 July 2018, > 22:40: > I have noticed that in BaseX 9.0.2 a query like > > fgrtu/data(.)/rep

[basex-talk] slash operator in Basex 9.0.2

2018-07-06 Thread Giuseppe Celano
I have noticed that in BaseX 9.0.2 a query like fgrtu/data(.)/replace(., "g", "h") gets evaluated (returning "fhrtu"), while in BaseX 8.x, Exist, and Zorba I get an error message (since, as expected, replace() is preceded not by a node but a string). Is this a bug? Ciao, Giuseppe
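For context: in a path expression, the left-hand operand of "/" must evaluate to nodes (otherwise error XPTY0019 is raised), whereas the simple map operator "!" accepts arbitrary items. A sketch of the same computation written with "!", using a placeholder element <x> because the element name in the original query was lost:

```xquery
(: <x> is a placeholder element; "!" applies each step to atomic values as well as nodes. :)
<x>fgrtu</x> ! data(.) ! replace(., "g", "h")
```

This form yields "fhrtu" without relying on how a given processor treats "/" after an atomic value.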

Re: [basex-talk] Add line-number function

2018-07-06 Thread Giuseppe Celano
> for $snippet at $pos in $snippets > where local-name($snippet) = 'non-match' > return { > $snippet/text() } > > Cheers, > Christian > > > On Fri, Jul 6, 2018 at 1:59 PM Giuseppe Celano > wrote: >> >> Yes, fn:path (not fn:node)! >> >

Re: [basex-talk] Add line-number function

2018-07-06 Thread Giuseppe Celano
Yes, fn:path (not fn:node)! the following works this is an example/nom/fn:path(.) with the useful result Q{http://www.w3.org/2005/xpath-functions}root()/Q{}nom[1] but the following does not (because tokenize() does not return a node) this is an example/tokenize(nom, " ")/fn:path(.) what I

Re: [basex-talk] Add line-number function

2018-07-06 Thread Giuseppe Celano
fn:node() returns the path to a node (including the text node): Is there a similar function to get character offsets within a text node? I am thinking of a case where, for example, one tokenizes a text within an element and would like to get the xpath + offsets for every token. > On Jul 6,

Re: [basex-talk] file:write and arrow operator

2018-07-04 Thread Giuseppe Celano
Thanks to both of you! This is very helpful. I will experiment with both solutions. Ciao, Giuseppe > On Jul 4, 2018, at 6:21 PM, Giuseppe Celano > wrote: > > Hi All, > > I was wondering if there is a way to take full advantage of the arrow > operator with file:write(

[basex-talk] file:write and arrow operator

2018-07-04 Thread Giuseppe Celano
Hi All, I was wondering if there is a way to take full advantage of the arrow operator with file:write(). If I want to write the results of a query, it would be ideal, I think, if the first parameter of file:write() were the content to write and the second the path: in this case I could have:
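The message is cut off at this point. One possible workaround, not necessarily either of the solutions proposed in the replies, is a small wrapper function that takes the content first, so that it can be used as the target of the arrow operator. The function name local:write-to and the output path are hypothetical.

```xquery
(: Hypothetical helper that swaps the argument order of file:write(). :)
declare function local:write-to(
  $content as item()*,
  $path as xs:string
) as empty-sequence() {
  file:write($path, $content)
};

(: The output path is an assumption; adjust it to a writable location. :)
<result>example</result> => local:write-to("output.xml")
```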

Re: [basex-talk] GUI

2018-06-29 Thread Giuseppe Celano
check if the errors you reported in your last mail are > dependent on the Java version you are using? > > Best, > Christian > > > > > > Giuseppe Celano <mailto:cel...@informatik.uni-leipzig.de>> wrote on Fri, 29 June 2018, > 19:55: > H

Re: [basex-talk] GUI

2018-06-29 Thread Giuseppe Celano
rün wrote: > > Hi Giuseppe, > > Did this happen with BaseX 8, too? Does it make a difference which > Java version you are using? > > Cheers, > Christian > > > On Fri, Jun 29, 2018 at 5:04 PM Giuseppe Celano > wrote: >> >> Hi, >> >

Re: [basex-talk] GUI

2018-06-29 Thread Giuseppe Celano
e just filed an issue [1] for > it. > > All the best, > Alex > > [1] https://github.com/BaseXdb/basex/issues/1582 > >> On 20. Jun 2018, at 21:28, Giuseppe Celano >> wrote: >> >> Hi, >> >> I have updated Java (10 from 8) and I can

[basex-talk] GUI

2018-06-20 Thread Giuseppe Celano
Hi, I have updated Java (from 8 to 10), and apparently I can no longer customize the GUI on my Mac (if I click on BaseXGUI > aboutBaseXGUI, I cannot access the relevant tabs). Is this a known issue? Moreover, if I start the GUI from the command line, I keep getting the warning message "Illegal

Re: [basex-talk] Data analysis

2018-05-26 Thread Giuseppe Celano
Hi Tim, You can serialize your data as you prefer in BaseX [1]: therefore you can easily make your computations in XML and then output whatever format is required for your visualization tool. For a fully automated approach, you can also take advantage of the Process Module [2], which enables

[basex-talk] -math:log10(1)

2018-05-26 Thread Giuseppe Celano
- math:log10(1) returns -0 but -0 returns 0: is there a reason for that? Thanks! Giuseppe
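A likely explanation: math:log10() returns an xs:double, and xs:double follows IEEE 754, which has a negative zero, so negating 0.0e0 gives -0.0e0. The literal 0 is an xs:integer, which has no negative zero. The following sketch checks the types involved; all three comparisons evaluate to true.

```xquery
(
  (-math:log10(1)) instance of xs:double,  (: math:log10() returns an xs:double :)
  (-0) instance of xs:integer,             (: 0 is an xs:integer literal :)
  (-math:log10(1)) eq (-0)                 (: numerically, both values are equal :)
)
```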

Re: [basex-talk] Atomization

2018-05-24 Thread Giuseppe Celano
s will be > evaluated by the index. > > Thanks for the sample documents, > Christian > > PS: 9.0.2 will be available until end of May. > > [1] http://files.basex.org/releases/latest/ > > > > On Tue, May 22, 2018 at 5:22 PM, Giuseppe Celano > <cel..

[basex-talk] Atomization

2018-05-22 Thread Giuseppe Celano
I think I have identified a problem with atomization of attribute content (no database involved). I have a simple query:
  for $s in doc("doc1")//s//t
  for $d in doc("doc2")//case
  where $d/verb_lemma = $s/@l
    and $d//verb_form/@value = $s/@f
    and $d/aspect-values/@sign = "yes"
  return $s
In order

Re: [basex-talk] How to use BaseX on MacBook? (Urgent!)

2018-05-16 Thread Giuseppe Celano
Hi Ben, If you already use BaseX on a Linux machine, you already know how to use it on a Mac :) Simply download and unzip the file http://files.basex.org/releases/9.0.1/BaseX901.zip and then click on BaseX.jar if you want to access the GUI quickly, or type one of the commands in the bin

Re: [basex-talk] Unexpected unary lookup result

2018-05-11 Thread Giuseppe Celano
array_test.xql" >>> true >>> true >>> >>> When using the web server, I still get this: >>> >>> $ curl localhost:8994/rest?run=array_test.xql >>> false >>> true >>> At first I thought there was some cache at

Re: [basex-talk] Unexpected unary lookup result

2018-05-11 Thread Giuseppe Celano
Hi Sebastian, In my BaseX 9.0.1 and 8.6.7 you get two "true". Best, Giuseppe

Re: [basex-talk] database creation baseX.8.6.7

2018-04-23 Thread Giuseppe Celano
s, > Christian > > > >> -Original Message- >> From: basex-talk-boun...@mailman.uni-konstanz.de >> [mailto:basex-talk-boun...@mailman.uni-konstanz.de] On behalf of Giuseppe >> Celano >> Sent: Monday, 23 April 2018 16:53 >> To: Christian Grün

Re: [basex-talk] database creation baseX.8.6.7

2018-04-23 Thread Giuseppe Celano
> Cheers, > Christian > > [1] http://docs.basex.org/wiki/Database_Module#db:create > > On Mon, Apr 23, 2018 at 3:03 PM, Giuseppe Celano > <cel...@informatik.uni-leipzig.de> wrote: >> Hi All, >> >> I can create a database via the GUI, but if I use db:create [

[basex-talk] database creation baseX.8.6.7

2018-04-23 Thread Giuseppe Celano
Hi All, I can create a database via the GUI, but if I use db:create [1] I get the message "out of main memory": why? Thanks!
  db:create("myDB", "sourceDirectory", "destinationDirectory", map{"ftindex": true(), "language": false()})
Best, Giuseppe
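For reference, a hedged sketch of the db:create() argument order as documented in the Database Module: the third argument names target paths inside the database, not a destination directory on disk. The directory name and the path "docs/" are placeholders, and this sketch does not claim to explain the memory error reported above.

```xquery
(: "sourceDirectory" and "docs/" are placeholders. :)
(: Third argument: target path(s) inside the database, not a directory on disk. :)
db:create(
  "myDB",
  "sourceDirectory",
  "docs/",
  map { "ftindex": true() }
)
```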

Re: [basex-talk] Unicode problem with database

2018-04-18 Thread Giuseppe Celano
> [1] > https://github.com/BaseXdb/basex/commit/9882669ad7b65bd51bc1d720c44d7c97df4685ff > [2] http://files.basex.org/releases/8.6.7/ > > > > On Wed, Apr 18, 2018 at 3:54 PM, Giuseppe Celano > <cel...@informatik.uni-leipzig.de> wrote: >> Hi, >> >>

[basex-talk] Collection function

2018-04-17 Thread Giuseppe Celano
Hi, It seems there is an error with the collection function. Something like this: collection("directory")[5] does not return anything in 9.0 but it does in 8.6.7 Best, Giuseppe

Re: [basex-talk] GUI

2018-04-16 Thread Giuseppe Celano
> On Apr 16, 2018, at 11:05 AM, Andy Bunce <bunce.a...@gmail.com> wrote: > > Hi Giuseppe, > > It has been moved to be the 1st button on the toolbar "New"(or ctl-T) > > /Andy > > On 16 April 2018 at 09:55, Giusep

[basex-talk] GUI

2018-04-16 Thread Giuseppe Celano
I see that in version 9.0 the "+" button to add a new tab is missing. I think it was very useful: can it be reintroduced in a future release? Best, Giuseppe

Re: [basex-talk] child node problem

2018-04-04 Thread Giuseppe Celano
; query as follows until 9.0.1 is released: > > for $ee in collection("my-path-to-files") > where $ee//case/aspect-values/@sign = "yes" > return $ee > > Hope this helps, thanks for the kudos, > Christian > > [1] http://files.basex.org/releas

[basex-talk] child node problem

2018-03-27 Thread Giuseppe Celano
Hi All, Thanks for this new release, which looks great! I have found a problem though (see error message below), when running a query like:
  for $ee in collection("my-path-to-files")
  where $ee//case[./aspect-values[@sign = "yes"]]
  return $ee
This works in version 8.6.7. The problem seems to be

[basex-talk] update Java

2018-01-15 Thread Giuseppe Celano
Hi, I am writing to ask whether it is now advisable to update to Java 9 (while using BaseX 8.6.x). Thanks. Best, Giuseppe

[basex-talk] XPath generator

2017-11-17 Thread Giuseppe Celano
Hi All, I would like to ask what the best way is in BaseX to create XPath expressions once I identify a certain span in an XML file. More concretely, I usually tokenize a text contained in an XML document, and I would like to specify for each token its position in the original document.
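One way this could be approached, as a minimal sketch: combine fn:path() for the containing element with character offsets computed from the token lengths. The sample document and element names are hypothetical, and the sketch assumes that tokens are separated by exactly one space.

```xquery
(: Hypothetical sample document; element names are placeholders. :)
let $nom := <root><nom>this is an example</nom></root>/nom
let $tokens := tokenize($nom, " ")
for $token at $i in $tokens
(: Start offset (1-based): lengths of all preceding tokens plus one separator each. :)
let $start := sum(
  for $previous in subsequence($tokens, 1, $i - 1)
  return string-length($previous) + 1
) + 1
let $end := $start + string-length($token) - 1
return path($nom) || " [" || $start || "-" || $end || "] " || $token
```

Each result pairs the path of the element with the start and end offsets of one token. Text with irregular whitespace would need a different offset calculation.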

Re: [basex-talk] db:text() vs XPath

2017-09-19 Thread Giuseppe Celano
Yes, this works! Thanks, Giuseppe

Re: [basex-talk] db:text() vs XPath

2017-09-19 Thread Giuseppe Celano
Hi Christian, It works only if I substitute your where clause with where db:text("db2", $k) Ciao, Giuseppe > On Sep 19, 2017, at 4:15 PM, Christian Grün wrote: > > where db:open("db2")/text/line[text() = $k]

[basex-talk] db:text() vs XPath

2017-09-19 Thread Giuseppe Celano
I am using BaseX 8.6.4 and I am trying to do a group-by/order-by operation, and I see that two logically equivalent queries perform very differently: one never seems to finish, while the other completes quickly. I can provide further details if necessary, but these are the queries (look at the last

Re: [basex-talk] Reading JSON

2017-08-15 Thread Giuseppe Celano
100 "parse-json(file:read-text('example.json'))" > basex -v -z -r100 "json-doc('example.json')" > > I tested the calls with a small and a large file (10 KB, 1.5 MB), and > evaluation times were very similar, so I guess I need some more input > to reproduce y

Re: [basex-talk] HTTP module

2017-08-14 Thread Giuseppe Celano
Thanks, Andy. I have also tried to invoke curl via proc:execute():
  proc:execute("curl", ("-F", "data=@example.txt", "-F", "tagger=", "-F", "parser=", "http://lindat.mff.cuni.cz/services/udpipe/api/process"))
The function works, but unfortunately the text inside the file is not recognized as

Re: [basex-talk] HTTP module

2017-08-14 Thread Giuseppe Celano
Thanks, Kendall, I tried but it does not work :(

Re: [basex-talk] Reading JSON

2017-08-14 Thread Giuseppe Celano
Hi Christian, The latter option. I just opened a file and ran the same query repeatedly. It is not an in-depth comparison at all, but the times shown in the Query Info were clearly different (even if only by milliseconds). Best, Giuseppe

Re: [basex-talk] HTTP module

2017-08-14 Thread Giuseppe Celano
> On 14 Aug 2017, at 14:11, Giuseppe Celano <cel...@informatik.uni-leipzig.de> > wrote: > > Hi, > > I am accessi

[basex-talk] Reading JSON

2017-08-14 Thread Giuseppe Celano
Hi, I have noticed different speeds when running the following functions (from slowest to fastest):
  parse-json(unparsed-text('example.txt'))
  json-doc("example.txt")
  parse-json(file:read-text('example.txt'))
similarly for documents on the web:
  parse-doc('http://example.com/text')

[basex-talk] HTTP module

2017-08-14 Thread Giuseppe Celano
Hi, I am accessing a RESTful API via the following command:
  curl -F data=@example.txt -F tokenizer= -F tagger= -F parser= http://lindat.mff.cuni.cz/services/udpipe/api/process > example2.txt
I am wondering what the best way is to do that in BaseX. The service also has a URL syntax, as shown

Re: [basex-talk] Differences in serialization of arrays with JSON vs. adaptive methods

2017-08-10 Thread Giuseppe Celano
Hi Joe, I am happy to hear you are also spreading the word! XQuery has a very clean data model, and BaseX has implemented and extended the language so efficiently and elegantly. Best, Giuseppe

Re: [basex-talk] Join operation and the database

2017-07-27 Thread Giuseppe Celano
re repeatedly accessed values in a map. This way, you can get > evaluation times less than a second. > > Hope this helps, > Christian > > > > On Thu, Jul 27, 2017 at 2:10 PM, Giuseppe Celano > <cel...@informatik.uni-leipzig.de> wrote: >> Hi Christian,
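The advice quoted in this reply, storing repeatedly accessed values in a map, could look roughly like the following sketch. The dictionary structure, element names, and attribute names are hypothetical, since the poster's actual files are not shown.

```xquery
(: Hypothetical dictionary and word forms; element and attribute names are placeholders. :)
let $dictionary :=
  <dictionary>
    <entry form="amo" lemma="amare"/>
    <entry form="vidi" lemma="videre"/>
  </dictionary>
(: Build the lookup map once instead of scanning the dictionary for every token. :)
let $lookup := map:merge(
  for $entry in $dictionary/entry
  return map:entry(string($entry/@form), string($entry/@lemma))
)
for $form in ("amo", "vidi", "xyz")
return $form || " -> " || ($lookup($form), "(not found)")[1]
```

Each lookup is then a map access rather than a repeated path evaluation over the dictionary, which is presumably where the time savings come from.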

Re: [basex-talk] Join operation and the database

2017-07-27 Thread Giuseppe Celano

[basex-talk] Join operation and the database

2017-07-27 Thread Giuseppe Celano
Hi, I performed join operations between many files and a dictionary. The files contain tokenized texts, where one finds word forms + fine-grained POS tags. Look at the following file: