The ability to open zipped XML files via doc() is awesome.
Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
04109 Leipzig
Deutschland
E-mail: cel...@informatik.uni-leipzig.de
E-mail: giuseppegacel...@gmail.com
Web site 1:
Yes, I build them, but I do not use them explicitly all the time.
> On Aug 13, 2018, at 12:04 AM, Liam R. E. Quin wrote:
>
> On Sun, 2018-08-12 at 23:58 +0200, Giuseppe Celano wrote:
>> more documents accessed sequentially is better than one
>> big file.
>
in the database, as far as I can
see: more documents accessed sequentially is better than one big file.
Ciao,
Giuseppe
> On Aug 10, 2018, at 9:09 PM, Liam R. E. Quin wrote:
>
> On Fri, 2018-08-10 at 13:43 +0200, Giuseppe Celano wrote:
>> I uploaded the file, as it is, in the database,
I uploaded it as csv (it is csv) via the GUI and it is then converted into XML
(this conversion probably makes it too big)
> On Aug 10, 2018, at 1:50 PM, Christian Grün wrote:
>
>> I uploaded the file, as it is, in the database
>
> So you uploaded the file as binary? Did you try to import it
g 10, 2018 at 1:36 PM Giuseppe Celano
> wrote:
>>
>> Hi,
>>
>> I am trying to work with a huge CSV file (about 380 MB), but if I build the
>> database, it seems that even simple operations cannot be evaluated. Is
>> splitting the CSV file the only option, or am I missing something here?
>> Thanks.
>>
>> Giuseppe
>>
>>
>
Hi,
I am trying to work with a huge CSV file (about 380 MB), but if I build the
database, it seems that even simple operations cannot be evaluated. Is splitting
the CSV file the only option, or am I missing something here? Thanks.
Giuseppe
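As a rough illustration of the splitting option asked about here, the sketch below cuts a large CSV file into smaller files that each repeat the header line. The paths and the chunk size are hypothetical. It assumes the target directory already exists and that no CSV field contains a line break. It also reads all lines into memory first, so it is only a starting point for a file of this size.

```xquery
(: Hypothetical paths and chunk size; adjust them to your setup. :)
let $source     := "/path/to/big.csv"
let $target-dir := "/path/to/chunks/"
let $chunk-size := 100000
(: Read the file line by line; the first line is assumed to be the header. :)
let $lines  := file:read-text-lines($source)
let $header := head($lines)
let $rows   := tail($lines)
(: Write one file per chunk, repeating the header in each of them. :)
for $n in 0 to (count($rows) - 1) idiv $chunk-size
let $chunk := subsequence($rows, $n * $chunk-size + 1, $chunk-size)
return file:write-text-lines(
  $target-dir || "chunk-" || $n || ".csv",
  ($header, $chunk)
)
```

Each resulting file can then be imported as a separate document.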
I have to experiment more, but since I tried to copy many xml files (which can
take some time) and did not see a difference, I would be tempted to say that
maybe the problem is something else. But as soon as I have some time, I will
test it again and let you know.
> On Jul 24, 2018, at 9:55
I tried with and without xquery:fork-join and I do not see any real difference
as far as evaluation time is concerned. When it works, time gets,
approximately, halved.
In my "activity monitor" I can actually see more R processes started by BaseX,
but in the other case I cannot see any new
Hi Christian,
Thanks for the reply. My query is of the type (simplified (pseudo)code):
let $u := for $r in (list of document names)
let $dirToWrite := "/directory/" || $r
return
function () {
( file:write($dirToWrite,
OK. Since I would like to create different databases for different projects,
it seems that the best strategy is to have a separate (complete) BaseX folder
for each project. BaseX is so lightweight that this does not seem to be an
issue, but I was not sure this was the way to go.
Thanks!
Giuseppe
DBPATH is
> a global option. It cannot be assigned at runtime; instead, it must be
> assigned before BaseX is started
>
> Ciao,
> Christian
>
> [1] http://docs.basex.org/wiki/Options
>
>
>
> On Mon, Jul 23, 2018 at 4:19 PM Giuseppe Celano
> wrote:
>>
Hi,
I would like to create a database in a directory other than "data" within the
BaseX folder. I used (within the GUI) the command
but it does not work. How can I specify that? Thanks.
Ciao,
Giuseppe
I am having fun with xquery:fork-join() and I see that it really reduces
evaluation time (!): I apply the same script to a collection of files, and if I
use xquery:fork-join() it takes about half of the time.
My computer has two cores. I was wondering what would happen if a computer had
more
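For readers who have not used the function, here is a minimal sketch of the xquery:fork-join() pattern described above. The workload (summing integer ranges) is a made-up placeholder for a per-file script, and any speed-up depends on the number of available cores and on the tasks themselves.

```xquery
(: Build one zero-argument function per task. Nothing is evaluated yet. :)
let $tasks :=
  for $n in (1000000, 2000000, 3000000, 4000000)
  return function() { sum(1 to $n) }
(: Evaluate the functions, in parallel where possible, and collect the results. :)
return xquery:fork-join($tasks)
```

To compare timings, the same functions can be called one after the other with `$tasks ! .()`.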
Hi All,
I am not sure whether the serialize function is working properly (the first
example works, the second does not: instead of tabs I get commas, and
there seems to be no way to request a header line)
=> csv:serialize(map{"header":"yes",
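The original examples are not preserved in this snippet. As a sketch only, this is how tab-separated output with a header line can be requested through the options map of csv:serialize() in the BaseX CSV Module. The input element names are invented, and the behaviour may differ in the version discussed in this thread.

```xquery
(: Made-up input in the "direct" CSV format: element names serve as column names. :)
let $csv :=
  <csv>
    <record><form>canis</form><pos>noun</pos></record>
    <record><form>currit</form><pos>verb</pos></record>
  </csv>
(: 'separator' and 'header' are options of the BaseX CSV Module. :)
return csv:serialize($csv, map { 'separator': 'tab', 'header': true() })
```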
> Which ordering criteria does this particular dictionary use?
It is the insertion order. I am just converting this code for fun and running
some tests. I will definitely have a look at Leo's code as well!
> On Jul 13, 2018, at 11:16 AM, Christian Grün
> wrote:
>
> Hi Giuseppe,
>
>> I
/
Web site 2: https://sites.google.com/site/giuseppegacelano/
> On Jul 13, 2018, at 12:57 AM, Giuseppe Celano
> wrote:
>
> Hi
>
> Is it possible to preserve the order of the keys in a map when the map is
> returned?:
>
> map{"b": 2, "c": 2,
Hi
Is it possible to preserve the order of the keys in a map when the map is
returned?:
map{"b": 2, "c": 2, "a": 3}
return
map {
"a": 3,
"b": 2,
"c": 2
}
Thanks!
Giuseppe
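XQuery 3.1 maps do not define an order for their keys, so the insertion order cannot be relied on when a map is returned. One common workaround, sketched here with the keys and values from the question, is to keep the desired key order in a separate sequence and look the values up in that order.

```xquery
(: The order of the keys is stored explicitly, next to the map itself. :)
let $keys := ("b", "c", "a")
let $map  := map { "b": 2, "c": 2, "a": 3 }
for $key in $keys
return $key || ": " || $map($key)
```

This returns the three strings in the order b, c, a, regardless of how the map itself is serialized.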
ike a too eager optimization. Did you have a chance to look at the
> resulting query plan?
>
>
>
> Giuseppe Celano <cel...@informatik.uni-leipzig.de> wrote on Fri, 6 July 2018,
> 22:40:
> I have noticed that in BaseX 9.0.2 a query like
>
> fgrtu/data(.)/rep
I have noticed that in BaseX 9.0.2 a query like
fgrtu/data(.)/replace(., "g", "h")
gets evaluated (returning "fhrtu"), while in BaseX 8.x, eXist, and Zorba I get
an error message (since, as expected, replace() is preceded not by a node but
by a string).
Is this a bug?
Ciao,
Giuseppe
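Independently of whether the behaviour above is a bug, the portable way to chain operations on atomic values is the simple map operator `!` instead of the path operator `/`. This is a minimal sketch; the element name `x` is a placeholder, since the markup of the original query is not preserved in this snippet.

```xquery
(: "!" accepts atomic values on its left-hand side, whereas "/" expects nodes. :)
let $node := <x>fgrtu</x>
return $node ! data(.) ! replace(., "g", "h")
```

This returns "fhrtu".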
> for $snippet at $pos in $snippets
> where local-name($snippet) = 'non-match'
> return {
> $snippet/text() }
>
> Cheers,
> Christian
>
>
> On Fri, Jul 6, 2018 at 1:59 PM Giuseppe Celano
> wrote:
>>
>> Yes, fn:path (not fn:node)!
>>
Yes, fn:path (not fn:node)!
the following works
this is an example/nom/fn:path(.)
with the useful result
Q{http://www.w3.org/2005/xpath-functions}root()/Q{}nom[1]
but the following does not (because tokenize() does not return a node)
this is an example/tokenize(nom, " ")/fn:path(.)
what I
fn:node() returns the path to a node (including the text node): Is there a
similar function to get character offsets within a text node?
I am thinking of a case where, for example, one tokenizes a text within an
element and would like to get the xpath + offsets for every token.
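One possible approach, sketched below under simple assumptions, is to split the text node with fn:analyze-string() and add up the lengths of the preceding parts. The function name local:token-offsets and the output element are hypothetical. Tokens are assumed to be separated by whitespace, and offsets are 1-based character positions within the text node.

```xquery
declare function local:token-offsets($text as text()) as element(token)* {
  (: fn:match elements hold the whitespace, fn:non-match elements the tokens. :)
  let $parts := analyze-string($text, "\s+")/*
  for $part at $i in $parts
  (: The start offset is the total length of all preceding parts, plus one. :)
  let $start := sum(subsequence($parts, 1, $i - 1) ! string-length(.)) + 1
  where $part instance of element(fn:non-match)
  return
    <token path="{path($text)}" start="{$start}"
           end="{$start + string-length($part) - 1}">{string($part)}</token>
};

local:token-offsets(<nom>this is an example</nom>/text())
```

Each returned element carries the path of the text node together with the start and end offset of one token.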
> On Jul 6,
Thanks to both of you! This is very helpful. I will experiment with both
solutions.
Ciao,
Giuseppe
> On Jul 4, 2018, at 6:21 PM, Giuseppe Celano
> wrote:
>
> Hi All,
>
> I was wondering if there is a way to take full advantage of the arrow
> operator with file:write(
Hi All,
I was wondering if there is a way to take full advantage of the arrow operator
with file:write(). If I want to write the results of a query, it would be
ideal, I think, if the first parameter of file:write() were the content to
write and the second the path: in this case I could have:
check if the errors you reported in your last mail are
> dependent on the Java version you are using?
>
> Best,
> Christian
>
>
>
>
>
> Giuseppe Celano <cel...@informatik.uni-leipzig.de> wrote on Fri, 29 June 2018,
> 19:55:
> H
rün wrote:
>
> Hi Giuseppe,
>
> Did this happen with BaseX 8, too? Does it make a difference which
> Java version you are using?
>
> Cheers,
> Christian
>
>
> On Fri, Jun 29, 2018 at 5:04 PM Giuseppe Celano
> wrote:
>>
>> Hi,
>>
e just filed an issue [1] for
> it.
>
> All the best,
> Alex
>
> [1] https://github.com/BaseXdb/basex/issues/1582
>
>> On 20. Jun 2018, at 21:28, Giuseppe Celano
>> wrote:
>>
>> Hi,
>>
>> I have updated Java (10 from 8) and I can
Hi,
I have updated Java (10 from 8) and I cannot apparently customize the GUI
anymore on my Mac (if I click on BaseXGUI > aboutBaseXGUI, I cannot access the
relevant tabs). Is this a known issue? Moreover, if I start the GUI from the
command line, I keep getting the warning message "Illegal
Hi Tim,
You can serialize your data as you prefer in BaseX [1]: therefore you can
easily make your computations in XML and then output whatever format is
required for your visualization tool.
For a fully automated approach, you can also take advantage of the Process
Module [2], which enables
- math:log10(1) returns -0 but -0 returns 0: is there a reason for that?
Thanks!
Giuseppe
s will be
> evaluated by the index.
>
> Thanks for the sample documents,
> Christian
>
> PS: 9.0.2 will be available by the end of May.
>
> [1] http://files.basex.org/releases/latest/
>
>
>
> On Tue, May 22, 2018 at 5:22 PM, Giuseppe Celano
> <cel..
I think I have identified a problem with atomization of attribute content (no
database involved). I have a simple query:
for $s in doc("doc1")//s//t
for $d in doc("doc2")//case
where $d/verb_lemma = $s/@l and $d//verb_form/@value = $s/@f and
$d/aspect-values/@sign = "yes"
return
$s
In order
Hi Ben,
If you already use BaseX on a Linux machine, you already know how to use it on
a Mac :) Simply download and unzip the file
http://files.basex.org/releases/9.0.1/BaseX901.zip
and then click on BaseX.jar if you want to access the GUI quickly, or type one
of the commands in the bin
array_test.xql"
>>> true
>>> true
>>>
>>> When using the web server, I still get this:
>>>
>>> $ curl localhost:8994/rest?run=array_test.xql
>>> false
>>> true
>>> At first I thought there was some cache at
Hi Sebastian,
In both BaseX 9.0.1 and 8.6.7, I get two "true" results.
Best,
Giuseppe
s,
> Christian
>
>
>
>> -Original Message-
>> From: basex-talk-boun...@mailman.uni-konstanz.de
>> [mailto:basex-talk-boun...@mailman.uni-konstanz.de] On behalf of Giuseppe
>> Celano
>> Sent: Monday, 23 April 2018 16:53
>> To: Christian Grün
> Cheers,
> Christian
>
> [1] http://docs.basex.org/wiki/Database_Module#db:create
>
> On Mon, Apr 23, 2018 at 3:03 PM, Giuseppe Celano
> <cel...@informatik.uni-leipzig.de> wrote:
>> Hi All,
>>
>> I can create a database via the GUI, but if I use db:create [
Hi All,
I can create a database via the GUI, but if I use db:create [1] I get the
message "out of main memory": why? Thanks!
db:create("myDB",
"sourceDirectory",
"destinationDirectory",
map{"ftindex": true(), "language": false()}
)
Best,
Giuseppe
> [1]
> https://github.com/BaseXdb/basex/commit/9882669ad7b65bd51bc1d720c44d7c97df4685ff
> [2] http://files.basex.org/releases/8.6.7/
>
>
>
> On Wed, Apr 18, 2018 at 3:54 PM, Giuseppe Celano
> <cel...@informatik.uni-leipzig.de> wrote:
>> Hi,
>>
>>
Hi,
It seems there is an error with the collection function. Something like this:
collection("directory")[5]
does not return anything in 9.0 but it does in 8.6.7
Best,
Giuseppe
Universität Leipzig
Institute of Computer Science, Digital Humanities
Augustusplatz 10
04109 Leipzig
Deutschland
://sites.google.com/site/giuseppegacelano/
> On Apr 16, 2018, at 11:05 AM, Andy Bunce <bunce.a...@gmail.com> wrote:
>
> Hi Giuseppe,
>
> It has been moved to be the 1st button on the toolbar "New"(or ctl-T)
>
> /Andy
>
> On 16 April 2018 at 09:55, Giusep
I see that in the 9.0 version the "+ button" to add a new tab is missing. I
think it was very useful: can it be re-introduced in the following releases?
Best,
Giuseppe
; query as follows until 9.0.1 is released:
>
> for $ee in collection("my-path-to-files")
> where $ee//case/aspect-values/@sign = "yes"
> return $ee
>
> Hope this helps, thanks for the kudos,
> Christian
>
> [1] http://files.basex.org/releas
Hi All,
Thanks for this new release, which looks great!
I have found a problem though (see error message below), when running a query
like:
for $ee in collection("my-path-to-files")
where $ee//case[./aspect-values[@sign = "yes"]]
return
$ee
This works in version 8.6.7. The problem seems to be
Hi,
I am writing to ask whether it is now advisable to update to Java 9 while using
BaseX 8.6.x. Thanks.
Best,
Giuseppe
Hi All,
I would like to ask what the best way is in BaseX to create XPath expressions
once I identify a certain span in an XML file. More concretely, I usually
tokenize a text contained in an XML document, and I would like to specify for
each token its position in the original document.
Yes, this works!
Thanks,
Giuseppe
Universität Leipzig
Institute of Computer Science, Digital Humanities
Augustusplatz 10
04109 Leipzig
Deutschland
E-mail: cel...@informatik.uni-leipzig.de
E-mail: giuseppegacel...@gmail.com
Web site 1: http://www.dh.uni-leipzig.de/wo/team/
Web site 2:
Hi Christian,
It works only if I substitute your where clause with
where db:text("db2", $k)
Ciao,
Giuseppe
> On Sep 19, 2017, at 4:15 PM, Christian Grün wrote:
>
> where db:open("db2")/text/line[text() = $k]
I am using BaseX 8.6.4 and I am trying to do a group-by/order-by operation, and
I see that two logically equivalent queries perform very differently: one
never seems to finish, while the other completes quickly. I can provide further
details if necessary, but these are the queries (look at the last
100 "parse-json(file:read-text('example.json'))"
> basex -v -z -r100 "json-doc('example.json')"
>
> I tested the calls with a small and a large file (10 KB, 1.5 MB), and
> evaluation times were very similar, so I guess I need some more input
> to reproduce y
Thanks, Andy. I have also tried to invoke curl via proc:execute():
proc:execute("curl", ("-F", "data=@example.txt", "-F", "tagger=", "-F",
"parser=", "http://lindat.mff.cuni.cz/services/udpipe/api/process"))
The function works, but unfortunately the text inside the file is not
recognized as
Thanks, Kendall, I tried but it does not work :(
Hi Christian,
The latter option. I just opened a file and ran the same query repeatedly. It
is not an in-depth comparison at all, but the times shown in the Query Info
were clearly different (even if only by milliseconds).
Best,
Giuseppe
> On 14 Aug 2017, at 14:11, Giuseppe Celano <cel...@informatik.uni-leipzig.de>
> wrote:
>
> Hi,
>
> I am accessi
Hi,
I have noticed different speeds when running the following functions (from
slowest to fastest):
parse-json(unparsed-text('example.txt'))
json-doc("example.txt")
parse-json(file:read-text('example.txt'))
similarly for documents on the web:
parse-doc('http://example.com/text')
Hi,
I am accessing a RESTful API via the following command:
curl -F data=@example.txt -F tokenizer= -F tagger= -F parser=
http://lindat.mff.cuni.cz/services/udpipe/api/process > example2.txt
I am wondering what the best way is to do that in BaseX. The service also has a
URL syntax, as shown
Hi Joe,
I am happy to hear you are also spreading the word! XQuery has a very clean
data model, and BaseX has implemented and extended the language efficiently
and elegantly.
Best,
Giuseppe
re repeatedly accessed values in a map. This way, you can get
> evaluation times less than a second.
>
> Hope this helps,
> Christian
>
>
>
> On Thu, Jul 27, 2017 at 2:10 PM, Giuseppe Celano
> <cel...@informatik.uni-leipzig.de> wrote:
>> Hi Christian,
Hi,
I performed join operations between many files and a dictionary. The files
contain tokenized texts, where one finds word forms + fine-grained POS tags.
Look at the following file: