[basex-talk] Variable value

2023-06-14 Thread Giuseppe G. A. Celano
Hi,

I have stumbled upon an unexpected issue concerning the value of a variable (in 
BaseX 10.6), which seems to be affected by an independent function. I have a 
function like

page:open-url($an1)

which takes a url as a parameter and opens a document. The url usually consists 
of an invariable part + a final number that identifies different documents. If 
I only pass the final number of the url,
the function, as expected, returns an error (because it tries to resolve a 
relative path pointing to no document). However, if I change the script into 

let $f := page:documents($an1, $an2)
return
page:open-url($an1)

the function page:open-url($an1) works properly (i.e., the presence of the 
independent let clause seems to affect the value of the variable $an1): I would 
not expect this in a functional programming language like XQuery. I have posted 
a gist here [1], where there are more details (look at the very end of the 
script)

Ciao,
Giuseppe


——
[1] https://gist.github.com/gcelano/31ef1880ac8439398c7f6de1de6d78d3






Re: [basex-talk] JSON serialization

2023-04-27 Thread Giuseppe G. A. Celano
Hi Christian,

This is the code:

{data()}

Ciao,
Giuseppe


> On 27. Apr 2023, at 13:38, Christian Grün  wrote:
> 
> Hi Giuseppe,
> 
> I’m sorry, I fail to understand how to simulate your use case. Could
> you please provide us with a minimized code snippet for testing?
> 
>> I have the comment node .
> 
> I assume it’s -- instead of —?
> 
>> If I extract its content with data() in an element, I get the following: 
>> #45;, and it seems there is no way to force #45; to become 
>>  within the element.
> 
> I tried this:
> 
> let $comment := 
> return element g { $comment }
> 
> It gives me . I assume it differs from your approach?
> 
> Thanks in advance,
> Christian
> 



Re: [basex-talk] JSON serialization

2023-04-27 Thread Giuseppe G. A. Celano
Hi Christian,

I have the comment node . If I extract its content with data() in an 
element, I get the following: #45;, and it seems there is no way to 
force #45; to become  within the element: I found a workaround to 
replace #45; with a dash '-', and this seems the best solution at the 
moment. My problem is that I have to refer to the positions of these characters 
(as character offsets), and therefore any change when I move a character from a 
comment node to an element node could break my reference system (indeed, 
#45; is not equivalent to  in BaseX: is the rendering of  as 
#45; correct?).

Ciao,
Giuseppe



> On 26. Apr 2023, at 18:06, Christian Grün  wrote:
> 
> Hi Giuseppe,
> 
>> However, I am having a hard time to deal with  “” ( using 
>> 

[basex-talk] JSON serialization

2023-04-26 Thread Giuseppe G. A. Celano
Hi all!

I have a few XML documents, whose content I would also like to provide in JSON. 
The XML files contain many strings encoded within comments, such as 

Re: [basex-talk] Pretty print

2022-11-17 Thread Giuseppe G. A. Celano
Hi,

it is:

declare option output:method 'xml';
declare option output:indent 'yes';

doc("myfile.xml")
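
For comparison, here is a minimal sketch of an alternative that passes the serialization parameters directly to fn:serialize instead of declaring them in the prolog. It assumes, as the question suggests, that indentation has to be requested explicitly in BaseX 10; the file name is the one used above.

```xquery
(: Sketch: serialize with explicit parameters rather than prolog options. :)
(: "myfile.xml" is the file name from the message above.                  :)
serialize(
  doc("myfile.xml"),
  map { 'method': 'xml', 'indent': true() }
)
```

The result is a string, so this variant is mostly useful when the indented text is written to a file or embedded in other output.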


Best,
Giuseppe


Dr. Giuseppe G. A. Celano
DFG-project leader
Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
Tel: +4934132223
04109 Leipzig
Deutschland

E-mail: cel...@informatik.uni-leipzig.de
Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano 
Web site 2: https://sites.google.com/site/giuseppegacelano/




> On 17. Nov 2022, at 14:15, Martin Honnen  wrote:
> 
> 
> On 11/17/2022 at 2:10 PM, Giuseppe G. A. Celano wrote:
>> Hi,
>> 
>> I am trying to prettyprint an XML file. I tried the serialization option 
>> “indent”=“yes”, but it does not work as expected. On BaseX 9, the 
>> prettyprint was the default setting: how to get the same result in BaseX 10 
>> (and later)? Thanks.
>> 
> 
> Can you show us your code?
> 
> For me
> 
> declare namespace output =
> "http://www.w3.org/2010/xslt-xquery-serialization";
> 
> declare option output:method 'xml';
> declare option output:indent 'yes';
> 
> declare context item := bar;
> 
> .
> 
> 
> gives the result
> 
> 
>   bar
> 
> 
> 
> in 10.3.
> 
> 
> 



[basex-talk] Pretty print

2022-11-17 Thread Giuseppe G. A. Celano
Hi,

I am trying to prettyprint an XML file. I tried the serialization option 
“indent”=“yes”, but it does not work as expected. On BaseX 9, the prettyprint 
was the default setting: how to get the same result in BaseX 10 (and later)? 
Thanks.

Best,
Giuseppe

[basex-talk] Java 11

2022-09-13 Thread Giuseppe G. A. Celano
Hi All,

On the BaseX website (https://basex.org/download/ 
), it is specified that “Java 11 is required to 
install and run BaseX (10.1)” . Since it is not specified “Java 11 or higher”, 
I am wondering whether this requirement is actually meant to be strict. Thanks!

Best,
Giuseppe







Re: [basex-talk] Integers as attribute values

2022-04-26 Thread Giuseppe G. A. Celano
Thank you, Christian, as always very useful!


> On 26. Apr 2022, at 19:54, Christian Grün  wrote:
> 
> Hi Giuseppe,
> 
> Is this due to the fact that BaseX tries to convert the values of @n of all 
> div elements into a number and, if it happens that the @n values returned are 
> all numbers, then an error is not raised (the comparison is then possible), 
> otherwise it is? Is this BaseX specific? Thanks.
> 
> Exactly; that’s standard XQuery. You can use an additional predicate to 
> restrict the comparisons to valid integers:
> 
> 
>   
>
>
> //div[@n[. castable as xs:integer][. = 21]]
> 
> Using string comparisons is usually the easier way, and it’s better supported 
> by our index structures. However, if your numeric data is irregular and has 
> additional whitespaces (such as in the third  element in my example), 
> string comparisons may be too strict.
> 
> A side note: I was surprised to see that Saxon EE 10 raises an exception for 
> my example query. It seems that the predicates are swapped, and the 
> comparison is evaluated before the cast check. Maybe the behavior has been 
> adapted in a more recent version. – A workaround is to use the and expression:
> 
> a[. castable as xs:integer and . = 21]
> 
> In BaseX, a predicate with an and expression will automatically be rewritten 
> to multiple predicates, as simple predicates can be further optimized more 
> easily, and predicates will always be evaluated in the order they are written 
> down.
> 
> Ciao,
> Christian
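
The difference between the two kinds of comparison can be sketched as follows. The sample data is hypothetical (the example from the original message did not survive in the archive); it only assumes div elements whose @n values are partly non-numeric and partly padded with whitespace.

```xquery
(: Hypothetical sample data, not taken from the thread. :)
let $doc :=
  <text>
    <div n="21"/>
    <div n="21a"/>
    <div n=" 21 "/>
  </text>
return (
  (: string comparison: matches only the exact string "21" :)
  count($doc//div[@n = "21"]),
  (: numeric comparison, restricted to values that can be cast to xs:integer, :)
  (: so that "21a" is skipped instead of raising a conversion error           :)
  count($doc//div[@n[. castable as xs:integer][. = 21]])
)
```

The unguarded form //div[@n = 21] would be expected to fail on this data, because "21a" cannot be converted to a number.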



[basex-talk] Integers as attribute values

2022-04-26 Thread Giuseppe G. A. Celano
Hi Everyone,

I have an XML document with elements such as . If I run the query 
doc("file.xml")//div[@subtype="chapter"]//*/parent::div[@n=21], I get the 
relevant div element, even though 21 is passed as an integer. On the other hand,
if I type doc("file.xml")//div[@n=21], I get the error "Cannot convert to 
xs:double", which can be solved by writing doc("file.xml")//div[@n="21"]. 

Is this due to the fact that BaseX tries to convert the values of @n of all div 
elements into a number and, if it happens that the @n values returned are all 
numbers, then an error is not raised (the comparison is then possible), 
otherwise it is? Is this BaseX specific? Thanks.

Best,
Giuseppe







Re: [basex-talk] file:read-text

2021-01-12 Thread Giuseppe G. A. Celano
I have found that a few files are not txt, but binary ones. I can open them 
with file:read-binary. However, the reference to XML can be confusing because I 
am not dealing with XML.
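
A minimal sketch of how mixed directories could be handled: try to read each file as text and fall back to binary when that fails. The directory name is a hypothetical placeholder, and the error name is the one shown in the message quoted below.

```xquery
(: Sketch; 'data/' is a hypothetical directory. :)
for $name in file:list('data/')
let $path := 'data/' || $name
where file:is-file($path)
return
  try {
    file:read-text($path)
  } catch file:io-error {
    (: not readable as text: return the raw content as xs:base64Binary :)
    file:read-binary($path)
  }
```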




> On 12. Jan 2021, at 13:41, Martin Honnen  wrote:
> 
> 
> 
> On 12.01.2021 13:33, Giuseppe G. A. Celano wrote:
>> Hi,
>> 
>> I am trying to open a bunch of files with file:read-text, but I get the 
>> error [file:io-error] invalid XML character: #0, even though I am not 
>> dealing with XML. Any idea why this happens? Thanks!
>> 
>> 
> Perhaps as XQuery strings impose XML rules for allowed characters? Can you 
> read out the files as binary hex or base64?



[basex-talk] file:read-text

2021-01-12 Thread Giuseppe G. A. Celano
Hi,

I am trying to open a bunch of files with file:read-text, but I get the error 
[file:io-error] invalid XML character: #0, even though I am not dealing with 
XML. Any idea why this happens? Thanks!

Best,
Giuseppe





[basex-talk] Database access in a function

2020-12-18 Thread Giuseppe G. A. Celano
Hi,

I have a script where I join two databases. It works. However, when I try to 
put the content of this script in a function, and then call it (with the 
databases being the arguments), the execution is slowed down (I guess the 
indexes are not properly accessed). Is there a way to overcome this? Thanks.







Re: [basex-talk] Sequence comparison

2020-11-26 Thread Giuseppe G. A. Celano
Hi Martin,

What I was looking for is a “quick way” to get a comparison such that an item 
(when it is repeated more than once) is returned only as many times as it 
appears in both sequences. For example, in 

for $a in (1,2,3,5,3,3)
where $a = (1,3,2,3, 2)
group by $k := $a
return
{$a}

you get 

1
2
3 3 3

with three 3s, even though the second sequence contains only two.
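
One way to get this behaviour is sketched below: for every distinct value, take the smaller of its two occurrence counts and repeat the value that many times. This is not a solution proposed in the thread, and the element name item is a hypothetical stand-in for the wrapper element used above.

```xquery
let $first  := (1, 2, 3, 5, 3, 3)
let $second := (1, 3, 2, 3, 2)
for $value in distinct-values($first)
(: number of occurrences in the sequence where the value is rarer :)
let $times := min((count($first[. = $value]), count($second[. = $value])))
where $times > 0
return <item>{ (1 to $times) ! $value }</item>
```

With the two sequences above, 1 and 2 are returned once each, 3 is returned twice, and 5 is dropped. The order of the groups follows distinct-values and is therefore not guaranteed.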


> On 26. Nov 2020, at 08:30, Martin Honnen  wrote:
> 
> On 26.11.2020 at 07:14, Martin Honnen wrote:
>> My bad, use
>> group by $k := $a return {$a}
> 
> To describe it more clearly, the whole FLWOR should be
> 
> for $a in (1,2,3,5)
> where $a = (1,3,2,3)
> group by $k := $a
> return
> {$a}
> 



Re: [basex-talk] Sequence comparison

2020-11-25 Thread Giuseppe G. A. Celano
Unfortunately this does not work because, if the second sequence has only one 3 
(and the first has two 3), I will still get two 3, while I should get only one.

> 
> Or more FLWOR like
>  for $a  in (1,2,3,5,3)
>  where $a = (1,3,2)
>  group by $o := $a
>  return
>  {$a}






> On 25. Nov 2020, at 09:11, Martin Honnen  wrote:
> 
> On 25.11.2020 at 08:39, Martin Honnen wrote:
>> On 25.11.2020 at 06:37, Giuseppe G. A. Celano wrote:
>> 
>>> I have to compare two sequences and find common items, irrespective of
>>> their positions:
>>> 
>>> for $a  in (1,2,3,5,3)
>>> for $u  in (1,3,2,3)
>>> where $a = $u
>>> group by $o := $a
>>> return
>>> {$a}
>>> 
>>> This returns
>>> 1
>>> 2
>>> 3 3 3 3
>>> 
>>> I would like 3 to be repeated only twice (i.e., each item in the first
>>> sequence should pair only with one in the second sequence): is there an
>>> XQuery "trick" for that in the FLWOR expression?
>> 
>> I think
>> 
>> for $a  in (1,2,3,5,3)[. = (1,3,2,3)]
>> group by $o := $a
>> return
>> {$a}
>> 
>> would do that
> 
> Or more FLWOR like
>  for $a  in (1,2,3,5,3)
>  where $a = (1,3,2,3)
>  group by $o := $a
>  return
>  {$a}
> 



[basex-talk] Sequence comparison

2020-11-24 Thread Giuseppe G. A. Celano
Hi,

I have to compare two sequences and find common items, irrespective of their 
positions:

for $a  in (1,2,3,5,3)
for $u  in (1,3,2,3)
where $a = $u
group by $o := $a
return
{$a}

This returns 
1
2
3 3 3 3

I would like 3 to be repeated only twice (i.e., each item in the first sequence 
should pair only with one in the second sequence): is there an XQuery "trick" 
for that in the FLWOR expression?







[basex-talk] Improper use/potential bug error

2020-11-21 Thread Giuseppe G. A. Celano
Hi,

I got an "Improper use? Potential bug?" error (see below) with a (complex) 
query which actually works when applied to many files (but not to all). This is 
the query (temporary links):

https://git.informatik.uni-leipzig.de/celano/latinnlp/-/blob/master/scripts/03.00_normalize_spelling.xq
 


applied to all files here:

https://git.informatik.uni-leipzig.de/celano/latinnlp/-/tree/master/texts/parsed-texts

Essentially, I have some words, each associated with an XPath expression and 
character offsets (see, for example, [1]). I can retrieve the words by applying 
substring($xpath, $start, $long) to [2]. My goal is to identify the text nodes 
that contain the words, because I want to check whether they occur inside 
certain elements.

Ciao,
Giuseppe


[1] 
https://git.informatik.uni-leipzig.de/celano/latinnlp/-/blob/master/texts/parsed-texts/phi2331.phi005.perseus-lat2/phi2331.phi005.perseus-lat2.tok01.xml
 

[2] 
https://git.informatik.uni-leipzig.de/celano/latinnlp/-/blob/master/texts/parsed-texts/phi2331.phi005.perseus-lat2/phi2331.phi005.perseus-lat2.xml
 

---

Error:
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de 

Version: BaseX 9.3.2
Java: Ubuntu, 11.0.9.1
OS: Linux, amd64
Stack Trace: 
java.lang.ArrayIndexOutOfBoundsException: Maximum array size reached.
at org.basex.util.Array.newSize(Array.java:299)
at org.basex.util.Array.newSize(Array.java:288)
at org.basex.util.TokenBuilder.add(TokenBuilder.java:267)
at org.basex.util.TokenBuilder.add(TokenBuilder.java:252)
at org.basex.query.QueryInfo.toString(QueryInfo.java:142)
at org.basex.query.QueryContext.info(QueryContext.java:474)
at org.basex.query.QueryProcessor.info(QueryProcessor.java:272)
at org.basex.core.cmd.AQuery.extError(AQuery.java:212)
at org.basex.core.cmd.AQuery.query(AQuery.java:130)
at org.basex.core.cmd.XQuery.run(XQuery.java:22)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
at org.basex.gui.GUI.exec(GUI.java:416)
at org.basex.gui.GUI.lambda$execute$4(GUI.java:359)
at java.base/java.lang.Thread.run(Thread.java:834)


[basex-talk] in comments

2020-11-17 Thread Giuseppe G. A. Celano
Hi, 

/data() is printed as  , while "” as -  (even if both are 
xs:string): is there a reason? In any case, is there a function to convert 
 into - ? Thanks.

Best,
Giuseppe






Re: [basex-talk] Progress bar

2020-11-11 Thread Giuseppe G. A. Celano
Hi Christian,

Great! I am using trace((), "my message"). Do you know if there is a way to 
avoid printing the parentheses ()?
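
One possible way around the printed empty sequence is to trace a real value, for instance the path that is about to be processed, and to pass the traced value on. A minimal sketch; the directory name and the function local:process are hypothetical placeholders for the actual script.

```xquery
(: Sketch; 'input/' and local:process are hypothetical placeholders. :)
declare function local:process($path as xs:string) as xs:integer {
  (: stand-in for the real work: here, just the size of the file :)
  file:size($path)
};

let $dir   := 'input/'
let $files := file:list($dir)
let $total := count($files)
for $name at $pos in $files
return local:process(
  (: trace returns its first argument unchanged and logs it with the label :)
  trace($dir || $name, 'file ' || $pos || ' of ' || $total || ': ')
)
```

Each call logs one line with a running counter, which is a rough substitute for a progress bar rather than a real one.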

Ciao,
Giuseppe




> On 11. Nov 2020, at 12:19, Christian Grün  wrote:
> 
> Hi Giuseppe,
> 
>> I have written a BaseX script (which applies a few functions to files in a 
>> directory), which I run at the command line. Is there a "trick" to get a 
>> progress bar for that (other than calling BaseX from a different programming 
>> language having some progress bar library)? Thanks.
> 
> If you run your script on the command line, I guess you’d want to have the
> progress bar rendered on the command line as well, right?
> 
> I would probably include some prof:dump or fn:trace calls in the code
> (the output cannot cope with a real progress bar, though).
> 
> Cheers,
> Christian
> 



[basex-talk] Progress bar

2020-11-11 Thread Giuseppe G. A. Celano
I have written a BaseX script (which applies a few functions to files in a 
directory), which I run at the command line. Is there a "trick" to get a 
progress bar for that (other than calling BaseX from a different programming 
language having some progress bar library)? Thanks.

Best,
Giuseppe






Re: [basex-talk] Joining large files

2020-07-13 Thread Giuseppe G. A. Celano
Hi Christian,

Thank you so much for your quick answer! Both scripts you gave work 
efficiently on my end! I had actually forgotten about pragmas. I tried to 
force the use of indexes by specifying data()/text nodes, but that did not 
work. I also remembered that maps can "do the trick", so I first converted the 
XML into JSON and then tried to merge the files, but that did not work either. 
If it is of interest to you, I uploaded the files and query here:

script:
https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/join_json_files.xq
1st file:  
https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/hib_parses.json
2nd file: 
https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/hib_lemmas.json

Thanks again for your help!

Ciao,
Giuseppe


> On 12. Jul 2020, at 15:46, Christian Grün  wrote:
> 
> One more solution that should be evaluated faster (the data to be
> looked up is directly stored in a map):
> 
> declare variable $hib_parses := db:open('hib_parses');
> declare variable $hib_lemmas := db:open('hib_lemmas');
> 
> let $lemmas := map:merge(
>  for $row in $hib_lemmas//row
>  where $row/field[@name = 'lemma_lang_id'] = '3'
>  return map:entry($row/field[@name = 'lemma_id'], $row)
> , map { 'duplicates': 'combine'})
> 
> for $parse in $hib_parses//row
> for $lemma in $lemmas($parse/field[@name = 'lemma_id'])
> return (# db:copynode false #) {
>  element wf  {
>    { $parse/* },
>{ $lemma/* }
>  }
> }
> 
> 
> 
> On 7/11/20, Giuseppe G. A. Celano  wrote:
>> Hi,
>> 
>> I am trying to perform a join operation between two large XML files (~490 MB
>> and ~40 MB), which are the result of the automatic conversion of old sql
>> dumps into XML files. I created two databases for the files. The query I
>> wrote to join them is correct because it works when I limit the join to just
>> a few items, but it never ends if I apply it to all items:
>> 
>> here is the xquery:
>> https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/join_files.xq
>> here is the first file:
>> https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/hib_parses.xml
>> here is the second file:
>> https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/hib_lemmas.xml
>> 
>> I have also tried to use the database module functions, but without success.
>> Am I missing anything here? Thanks.
>> 
>> Ciao,
>> Giuseppe



Re: [basex-talk] Joining large files

2020-07-11 Thread Giuseppe G. A. Celano
It is the remnant of a previous version of the script, but it does not affect 
the query, as far as I have seen. It is deleted now.


> On Jul 11, 2020, at 3:05 PM, Martin Honnen  wrote:
> 
> On 11.07.2020 at 14:41, Giuseppe G. A. Celano wrote:
> 
>> I am trying to perform a join operation between two large XML files
>> (~490 MB and ~40 MB), which are the result of the automatic conversion
>> of old sql dumps into XML files. I created two databases for the files.
>> The query I wrote to join them is correct because it works when I limit
>> the join to just a few items, but it never ends if I apply it to all items:
>> 
>> here is the xquery:
>> https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/join_files.xq
> 
> Isn't the
>   where $nn
> kind of meaningless? I don't think you can have an empty sequence $nn,
> as you don't use `allowing empty` when you bind that variable in the
> nested `for`.
> 
> No idea of course whether that changes the problem you encounter.
> 
> 
> 
> 



[basex-talk] Joining large files

2020-07-11 Thread Giuseppe G. A. Celano
Hi,

I am trying to perform a join operation between two large XML files (~490 MB 
and ~40 MB), which are the result of the automatic conversion of old sql dumps 
into XML files. I created two databases for the files. The query I wrote to 
join them is correct because it works when I limit the join to just a few 
items, but it never ends if I apply it to all items:

here is the xquery: 
https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/join_files.xq
 

here is the first file: 
https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/hib_parses.xml
 

here is the second file:  
https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/hib_lemmas.xml
 


I have also tried to use the database module functions, but without success. Am 
I missing anything here? Thanks.

Ciao,
Giuseppe

[basex-talk] Position of attributes using copy/modify

2020-07-06 Thread Giuseppe G. A. Celano
Hi,

I know that the order of attributes in XML is not relevant, but I am wondering 
whether there is a way to add an attribute as the last one using copy/modify 
expression. The “as last” option seems not to work for attributes.
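
Since attribute order is not significant in the data model, position keywords have no defined effect when attributes are inserted. If the serialized order matters, one option is to rebuild the element and list the new attribute after the existing ones. A minimal sketch with a hypothetical element; processors often keep the order in which attributes are constructed, but this is an assumption and not something the specification guarantees.

```xquery
(: Sketch with a hypothetical element; attribute order is not guaranteed. :)
let $e := <word id="w1" lemma="sum">est</word>
return element { node-name($e) } {
  $e/@*,                      (: existing attributes first :)
  attribute pos { 'verb' },   (: new attribute listed last :)
  $e/node()                   (: children copied unchanged :)
}
```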

Cheers,
Giuseppe

[basex-talk] Writing

2020-06-11 Thread Giuseppe G. A. Celano
Hi,

I would like to print a comment containing only a dash (i.e., ) , but 
this is not allowed. I tried to use "" instead of - in comment { "” 
}, but such a string is converted to ‘-', so the error is still there. How can 
I print  ? Should I simply use <>? In any case, I see that 
data() does not convert  into -: how can I get the conversion? 
Thanks.

Best,
Giuseppe

[basex-talk] Text index

2019-12-12 Thread Giuseppe G. A. Celano
Hi,

I am wondering whether it is possible to use indexes in a script even if a 
database is not created explicitly/previously. I see that sometimes, when 
comparing different XML texts accessed via doc(), text indexes are 
automatically created/used, but it is not clear to me whether/how I can 
specify/force the use of them also in the script. Thanks.

Best,
Giuseppe

Re: [basex-talk] running in parallel

2019-12-09 Thread Giuseppe G. A. Celano
I forgot to mention that I often use fork-join() with proc:execute() to run in 
parallel more than one instance of the OCR engine “tesseract” (I can have more 
than 1000 images to OCR): it works fabulously (up to 8 processes on my 
Quad-core Intel Core i7). More in general, when it comes to running system 
programs, fork-join() + proc:execute() is extremely useful.
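
The pattern looks roughly like this. It is a sketch rather than my actual script: the directory names and the file pattern are assumptions, and tesseract is called with an image name and an output base name only.

```xquery
(: Sketch; directory names and the '*.png' pattern are assumptions. :)
let $in  := 'images/'
let $out := 'ocr/'
return xquery:fork-join(
  for $image in file:list($in, false(), '*.png')
  return function() {
    (: one external tesseract process per image :)
    proc:execute('tesseract', ($in || $image, $out || $image))
  }
)
```

Each function item wraps one external call, and fork-join returns the results of all calls once they have finished.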

Giuseppe


> On 9. Dec 2019, at 18:54, Giuseppe G. A. Celano 
>  wrote:
> 
> Thanks for your answers!
> 
> I have run an experiment, and I confirm that fork-join() actually works, even 
> if the gain is not as expected. Most importantly, I noticed that the amount 
> of RAM made available is crucial: with 2MB the sequential script was very 
> slow, while with 5/7MB it works fine.
> 
> (2,8 GHz Quad-Core Intel Core i7)
> 
> Sequential: about 47 s.
> fork-join: about 40 s.
> GNU parallel: about 30 s.
> 
> Best,
> Giuseppe
> 
> 
>> On 9. Dec 2019, at 16:58, Omar Siam  wrote:
>> 
>> Hi,
>> 
>> I see the same in my application. My two cents of wisdom: I would say most 
>> disks today will be fast enough to mask this problem, let alone SSDs, which 
>> can happily fetch two files at (almost) the same time. But the thing is: the 
>> existing code uses some pretty heavy locks to make sure no two Java threads 
>> access the same (database) file at the same time. And unless this is really 
>> given some thought for data safety, I am glad that it does not allow queries 
>> to run in parallel. I would love to solve this in a more state-of-the-art 
>> way, but I got burned in the past by multithreading. So I have great respect 
>> for any good, safe and fast implementation of multithreaded file access. I 
>> fear no one has done one yet for BaseX.
>> 
>> Best regards
>> 
>> Omar Siam
>> 
>> On 08.12.2019 at 17:04, Markus Wittenberg wrote:
>>> Hi Giuseppe,
>>> 
>>> as long as the files are not on physically different disks, you will have 
>>> the two functions block each other with read and write operations. And 
>>> BaseX runs lots of code in parallel without you explicitly telling it so.
>>> 
>>> Best regards,
>>> 
>>> Markus
>>> 
>>> On 08.12.2019 at 16:48, cel...@informatik.uni-leipzig.de wrote:
>>>> Hi,
>>>> 
>>>> I am trying to run two BaseX scripts in parallel using:
>>>> 
>>>> xquery:fork-join(
>>>> (
>>>> function() {xquery:eval(xs:anyURI('extract_from_ocr1.xq'))}
>>>> ,
>>>> function (){xquery:eval(xs:anyURI('extract_from_ocr2.xq'))}
>>>> )
>>>>   )
>>>> 
>>>> As far as I can understand (read below), the scripts are kind of run in 
>>>> parallel, but still the time benefit of that does not seem much in 
>>>> comparison with running in sequence (~25s vs ~28s). The files contain the 
>>>> same function, which reads files from a directory, performs some 
>>>> calculation, and saves the result in a file (the two scripts work on 
>>>> different directories). I infer that the previous script is run in 
>>>> parallel because the files for the results are created at the same time.
>>>> 
>>>> I tried to do the same with GNU parallel, and in that case the files are 
>>>> actually run in parallel.
>>>> 
>>>> Do we know why the execution time is not (more or less) halved in BaseX? 
>>>> Thanks.
>>>> 
>>>> Ciao,
>>>> Giuseppe
>>>> 
> 
> 



Re: [basex-talk] running in parallel

2019-12-09 Thread Giuseppe G. A. Celano
Thanks for your answers!

I have run an experiment, and I confirm that fork-join() actually works, even 
if the gain is not as expected. Most importantly, I noticed that the amount of 
RAM made available is crucial: with 2MB the sequential script was very slow, 
while with 5/7MB it works fine.

(2,8 GHz Quad-Core Intel Core i7)

Sequential: about 47 s.
fork-join: about 40 s.
GNU parallel: about 30 s.

Best,
Giuseppe


> On 9. Dec 2019, at 16:58, Omar Siam  wrote:
> 
> Hi,
> 
> I see the same in my application. My two cents of wisdom: I would say most 
> disks today will be fast enough to mask this problem, let alone SSDs, which 
> can happily fetch two files at (almost) the same time. But the thing is: the 
> existing code uses some pretty heavy locks to make sure no two Java threads 
> access the same (database) file at the same time. And unless this is really 
> given some thought for data safety, I am glad that it does not allow queries 
> to run in parallel. I would love to solve this in a more state-of-the-art way, 
> but I got burned in the past by multithreading. So I have great respect for 
> any good, safe and fast implementation of multithreaded file access. I fear 
> no one has done one yet for BaseX.
> 
> Best regards
> 
> Omar Siam
> 
> On 08.12.2019 at 17:04, Markus Wittenberg wrote:
>>  Hi Giuseppe,
>> 
>> as long as the files are not on physically different disks, you will have 
>> the two functions block each other with read and write operations. And BaseX 
>> runs lots of code in parallel without you explicitly telling it so.
>> 
>> Best regards,
>> 
>> Markus
>> 
>> On 08.12.2019 at 16:48, cel...@informatik.uni-leipzig.de wrote:
>>> Hi,
>>> 
>>> I am trying to run two BaseX scripts in parallel using:
>>> 
>>> xquery:fork-join(
>>> (
>>> function() {xquery:eval(xs:anyURI('extract_from_ocr1.xq'))}
>>> ,
>>> function (){xquery:eval(xs:anyURI('extract_from_ocr2.xq'))}
>>> )
>>>)
>>> 
>>> As far as I can understand (read below), the scripts are kind of run in 
>>> parallel, but still the time benefit of that does not seem much in 
>>> comparison with running in sequence (~25s vs ~28s). The files contain the 
>>> same function, which reads files from a directory, performs some 
>>> calculation, and saves the result in a file (the two scripts work on 
>>> different directories). I infer that the previous script is run in parallel 
>>> because the files for the results are created at the same time.
>>> 
>>> I tried to do the same with GNU parallel, and in that case the files are 
>>> actually run in parallel.
>>> 
>>> Do we know why the execution time is not (more or less) halved in BaseX? 
>>> Thanks.
>>> 
>>> Ciao,
>>> Giuseppe
>>> 



Re: [basex-talk] Join

2019-11-29 Thread Giuseppe G. A. Celano
Hi Christian,

Thank you very much for this detailed explanation! If I understand correctly, 
the index option, which makes everything faster, is an optimization that is 
independent from XQuery per se. This explains why it is activated only under 
certain circumstances, independently from the fact that two XQuery expressions 
are supposed to return the same result. Thanks.

Best,
Giuseppe



> On Nov 28, 2019, at 5:54 PM, Christian Grün  wrote:
> 
> Hi Giuseppe,
> 
> Thanks for passing me on your data sets. Some background information:
> 
> • If you look at the query info, you’ll see that your query won’t be
> rewritten for index access.
> 
> • Without index access, your query will need to perform the impressive
> amount of 1440254 * 17573 = 25 billion comparisons.
> 
> • The optimized version of the query with text() steps can be
> evaluated much faster, as it utilizes both the text and the attribute
> index:
> 
>  db:text("hib_parses", db:attribute("hib_lemmas", "lemma_id") 
> /parent::row)
> 
> • A and A/text() cannot be treated identically by the query processor:
> An element node may have more than one text node (an example:
> <A>a<_/>b</A>). The atomized result will always be a single value,
> whereas A/text() will give you two values.
> 
> • In some cases, the optimizer will implicitly add text nodes to path
> expressions if it’s a) possible at compile time to determine that a
> given step has only single text nodes, and b) the query will not yield
> different results. In the next step, paths with trailing text() steps
> may then be rewritten for index access.
> 
> • Some optimizations are restricted to documents without namespaces.
> Adding the text() step is one of them, so this could be the reason why
> you need to add this step manually.
> 
> Hope this helps,
> Christian
> 
> PS: I will see if there’s a chance to enable the discussed
> optimization for documents with namespaces.
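
The difference between atomizing an element and selecting its text nodes can be checked with a small query. This is a sketch built on the kind of mixed-content element described above: an element A with two text nodes separated by an empty child.

```xquery
let $a := <A>a<_/>b</A>
return (
  count(data($a)),    (: atomization yields a single value, "ab" :)
  count($a/text())    (: the path yields two text nodes, "a" and "b" :)
)
```

The query returns 1 and 2, which is why a comparison on A cannot simply be replaced by a comparison on A/text().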
> 
> 
> 
> On Thu, Nov 28, 2019 at 1:45 AM Giuseppe G. A. Celano
>  wrote:
>> 
>> Hi,
>> 
>> I have the following query:
>> 
>> count(
>> for $r in doc("hib_parses.xml")//row
>> let $i := doc("hib_lemmas.xml")//row[field[@name="lemma_lang_id"][. = "3"]]
>> where $r/field[@name="lemma_id"] = $i/field[@name="lemma_id"]
>> return $r
>> )
>> 
>> I have noticed that the where clause needs to be changed into 
>> $r/field[@name="lemma_id"]/text() = $i/field[@name="lemma_id"]/text() in 
>> order to get a result (otherwise the query seems to never end).
>> I am wondering whether this is a BaseX issue, in that I would assume that 
>> the two kinds of where clause are equivalent (because of atomization). I 
>> have also noticed that /data() does not work either. Thanks!
>> 
>> Best,
>> Giuseppe
> 



[basex-talk] Join

2019-11-27 Thread Giuseppe G. A. Celano
Hi,


I have the following query:

count(

for $r in doc("hib_parses.xml")//row

let $i := doc("hib_lemmas.xml")//row[field[@name="lemma_lang_id"][. = "3"]]

where $r/field[@name="lemma_id"] = $i/field[@name="lemma_id"]

return 
$r

)

I have noticed that the where clause needs to be changed into 
$r/field[@name="lemma_id"]/text() = $i/field[@name="lemma_id"]/text() in order 
to get a result (otherwise the query seems to never end).
I am wondering whether this is a BaseX issue, in that I would assume that the 
two kinds of where clause are equivalent (because of atomization). I have also 
noticed that /data() does not work either. Thanks!

Best,
Giuseppe

Re: [basex-talk] proc:system

2019-11-13 Thread Giuseppe G. A. Celano
I would expect just some text, as with proc:system("ls")


  Usage:
  /usr/local/bin/tesseract --help | --help-extra | --version
  /usr/local/bin/tesseract --list-langs
  /usr/local/bin/tesseract imagename outputbase [options...] [configfile...]

OCR options:
  -l LANG[+LANG]        Specify language(s) used for OCR.
NOTE: These options must occur before any configfile.

Single options:
  --help                Show this help message.
  --help-extra          Show extra help for advanced users.
  --version             Show version information.
  --list-langs          List available languages for tesseract engine.

  1


I guess that the "1" code blocks the printing. If I use 
proc:system("/usr/local/bin/tesseract", "--help"), it works.
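
To see the exit code and both output streams side by side, the result element of proc:execute can be taken apart. A minimal sketch, assuming the result has the child elements output, error and code; the path is the one used above.

```xquery
(: Sketch; assumes the result element has output, error and code children. :)
let $result := proc:execute('/usr/local/bin/tesseract', '--help')
return (
  'exit code: ' || $result/code,
  $result/output/string(),
  $result/error/string()
)
```

Unlike proc:system, this does not raise an error for a non-zero exit code, so the text printed by the program stays available.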

E-mail: cel...@informatik.uni-leipzig.de 

Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano 
 
Web site 2: https://sites.google.com/site/giuseppegacelano/ 


> On Nov 13, 2019, at 6:48 PM, Christian Grün  wrote:
> 
>> Interestingly, proc:execute("/usr/local/bin/tesseract") works (I have BaseX 
>> 9.2).
> 
> What does the output look like?
> 
>> proc:system("/usr/local/bin/tesseract") returns the following:
> 
> If the code 1 is raised, it indicates that your command will be
> executed indeed, but it returns the exit code 1. Which output would
> you expect?
> 
> 
> 
> 
> 
>>> SET DEBUG true
>> DEBUG: true
>>> XQUERY proc:system("/usr/local/bin/tesseract")
>> org.basex.query.QueryException:
>> at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
>> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
>> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
>> at org.basex.query.scope.MainModule.iter(MainModule.java:97)
>> at org.basex.query.QueryContext.iter(QueryContext.java:332)
>> at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
>> at org.basex.core.cmd.AQuery.query(AQuery.java:107)
>> at org.basex.core.cmd.XQuery.run(XQuery.java:22)
>> at org.basex.core.Command.run(Command.java:257)
>> at org.basex.core.Command.execute(Command.java:93)
>> at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
>> at org.basex.api.client.Session.execute(Session.java:36)
>> at org.basex.core.CLI.execute(CLI.java:92)
>> at org.basex.core.CLI.execute(CLI.java:76)
>> at org.basex.BaseX.console(BaseX.java:176)
>> at org.basex.BaseX.<init>(BaseX.java:151)
>> at org.basex.BaseX.main(BaseX.java:42)
>> org.basex.core.BaseXException: Stopped at ., 1/12:
>> [proc:code0001]
>> at org.basex.core.Command.execute(Command.java:94)
>> at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
>> at org.basex.api.client.Session.execute(Session.java:36)
>> at org.basex.core.CLI.execute(CLI.java:92)
>> at org.basex.core.CLI.execute(CLI.java:76)
>> at org.basex.BaseX.console(BaseX.java:176)
>> at org.basex.BaseX.<init>(BaseX.java:151)
>> at org.basex.BaseX.main(BaseX.java:42)
>> Caused by: org.basex.query.QueryException:
>> at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
>> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
>> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
>> at org.basex.query.scope.MainModule.iter(MainModule.java:97)
>> at org.basex.query.QueryContext.iter(QueryContext.java:332)
>> at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
>> at org.basex.core.cmd.AQuery.query(AQuery.java:107)
>> at org.basex.core.cmd.XQuery.run(XQuery.java:22)
>> at org.basex.core.Command.run(Command.java:257)
>> at org.basex.core.Command.execute(Command.java:93)
>> ... 7 more
>> Stopped at ., 1/12:
>> [proc:code0001]
>> 
>> 
>> 
>> 
>> On Nov 13, 2019, at 5:50 PM, Christian Grün  
>> wrote:
>> 
>> Hi Giuseppe,
>> 
>> When I try to run
>> proc:system("/usr/local/bin/tesseract") I get the error [proc:code0001]
>> 
>> 
>> On my system, I get the (expected) error…
>> 
>> [proc:error] Cannot run program "/usr/local/bin/tesseract":
>> CreateProcess error=2, Das System kann die angegebene Datei nicht finden
>> 
>> …so we may need to find out what code 1 means in your case. Could you
>> run the query with debugging enabled and pass us on the stack trace?
>> 
>> And your error code indicates that you are using an older version of
>> BaseX. Does it work with a more recent version? If not, what do you
>> get?
>> 
>> Best,
>> Christian
>> 
>> 
>> 
>> 
>> 
>> Similarly:
>> 
>> proc:system("tesseract") returns [proc:error] Cannot run program 
>> "tesseract": error=2, No such file or directory
>> 
>> Similarly:
>> 
>> proc:system("tesseract", (), map {"dir" : "/usr/local/bin/"}) returns 
>> [proc:error] Cannot run program "tesseract" (in directory "/usr/local/bin"): 
>> error=2, No such file or directory
>> 
>> The command "tesseract" works at the command line. I suspect there may be a 
>> problem with permissions: is there a way to overcome this error? Thanks.
>> 

Re: [basex-talk] proc:system

2019-11-13 Thread Giuseppe G. A. Celano
Hi Christian,

Interestingly, proc:execute("/usr/local/bin/tesseract") works (I have BaseX 
9.2).

proc:system("/usr/local/bin/tesseract") returns the following:

> SET DEBUG true
DEBUG: true
> XQUERY proc:system("/usr/local/bin/tesseract")
org.basex.query.QueryException: 
at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
at org.basex.query.scope.MainModule.iter(MainModule.java:97)
at org.basex.query.QueryContext.iter(QueryContext.java:332)
at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
at org.basex.core.cmd.AQuery.query(AQuery.java:107)
at org.basex.core.cmd.XQuery.run(XQuery.java:22)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
at org.basex.api.client.Session.execute(Session.java:36)
at org.basex.core.CLI.execute(CLI.java:92)
at org.basex.core.CLI.execute(CLI.java:76)
at org.basex.BaseX.console(BaseX.java:176)
at org.basex.BaseX.<init>(BaseX.java:151)
at org.basex.BaseX.main(BaseX.java:42)
org.basex.core.BaseXException: Stopped at ., 1/12:
[proc:code0001] 
at org.basex.core.Command.execute(Command.java:94)
at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
at org.basex.api.client.Session.execute(Session.java:36)
at org.basex.core.CLI.execute(CLI.java:92)
at org.basex.core.CLI.execute(CLI.java:76)
at org.basex.BaseX.console(BaseX.java:176)
at org.basex.BaseX.<init>(BaseX.java:151)
at org.basex.BaseX.main(BaseX.java:42)
Caused by: org.basex.query.QueryException: 
at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
at org.basex.query.scope.MainModule.iter(MainModule.java:97)
at org.basex.query.QueryContext.iter(QueryContext.java:332)
at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
at org.basex.core.cmd.AQuery.query(AQuery.java:107)
at org.basex.core.cmd.XQuery.run(XQuery.java:22)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
... 7 more
Stopped at ., 1/12:
[proc:code0001] 




> On Nov 13, 2019, at 5:50 PM, Christian Grün  wrote:
> 
> Hi Giuseppe,
> 
>> When I try to run
>> proc:system("/usr/local/bin/tesseract") I get the error [proc:code0001]
> 
> On my system, I get the (expected) error…
> 
> [proc:error] Cannot run program "/usr/local/bin/tesseract":
> CreateProcess error=2, Das System kann die angegebene Datei nicht finden
> 
> …so we may need to find out what code 1 means in your case. Could you
> run the query with debugging enabled and pass us on the stack trace?
> 
> And your error code indicates that you are using an older version of
> BaseX. Does it work with a more recent version? If not, what do you
> get?
> 
> Best,
> Christian
> 
> 
> 
> 
>> 
>> Similarly:
>> 
>> proc:system("tesseract") returns [proc:error] Cannot run program 
>> "tesseract": error=2, No such file or directory
>> 
>> Similarly:
>> 
>> proc:system("tesseract", (), map {"dir" : "/usr/local/bin/"}) returns 
>> [proc:error] Cannot run program "tesseract" (in directory "/usr/local/bin"): 
>> error=2, No such file or directory
>> 
>> The command "tesseract" works at the command line. I suspect there may be a 
>> problem with permissions: is there a way to overcome this error? Thanks.
>> 
>> Best,
>> Giuseppe
>> 
> 



[basex-talk] proc:system

2019-11-13 Thread Giuseppe G. A. Celano
Hi,

When I try to run 

proc:system("/usr/local/bin/tesseract") I get the error [proc:code0001] 

Similarly:

proc:system("tesseract") returns [proc:error] Cannot run program "tesseract": 
error=2, No such file or directory

Similarly:

proc:system("tesseract", (), map {"dir" : "/usr/local/bin/"}) returns 
[proc:error] Cannot run program "tesseract" (in directory "/usr/local/bin"): 
error=2, No such file or directory

The command "tesseract" works at the command line. I suspect there may be a 
problem with permissions: is there a way to overcome this error? Thanks.

Best,
Giuseppe



[basex-talk] HTTP request

2019-10-18 Thread Giuseppe G. A. Celano
Hi,

I have a curl request of the kind 

curl --data "something" http://something 

which I would like to perform using the HTTP module, but I cannot specify the 
data argument in the query. Any help?

The following does not work:

http:send-request(, 
'http://something' ,
"something"
)

Best,
Giuseppe
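
For reference: curl --data sends a POST request whose body has the media type application/x-www-form-urlencoded. A minimal sketch of the same request with the HTTP Module, assuming the http prefix is bound to the EXPath HTTP Client namespace (it is predeclared in BaseX) and keeping the placeholder URL and payload from above. The body is declared as an empty http:body element and its content is passed as the third argument:

```xquery
http:send-request(
  (: request description: POST with a form-encoded body :)
  <http:request method="post">
    <http:body media-type="application/x-www-form-urlencoded"/>
  </http:request>,
  (: target URL (placeholder from the question) :)
  "http://something",
  (: body content (placeholder from the question) :)
  "something"
)
```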

Re: [basex-talk] GUI, visualization panels

2019-09-10 Thread Giuseppe G. A. Celano
Hi, no, I just open the GUI and then an XML document. Thanks.

> On Sep 9, 2019, at 2:52 PM, Alexander Holupirek  wrote:
> 
> Hi Giuseppe,
> 
> just gave it a try using basex 9.2.4 with java version "1.8.0_121" on macOS 
> 10.14.6.
> I was not able to reproduce the issue so far.
> 
> One thing, did you open a database first? Visualization are only available on 
> (opened) databases (not in-memory documents, such as XML read from 
> filesystem, http, ...)
> 
> Kind regards,
>   Alex
> 
>> On 7. Sep 2019, at 15:56, Giuseppe G. A. Celano 
>>  wrote:
>> 
>> 1.8.0_171
>> 
>> Thanks.
>> 
>> 
>> 
>>> On Sep 7, 2019, at 1:43 PM, Alexander Holupirek  wrote:
>>> 
>>> Hi Giuseppe,
>>> 
>>> what java version do you use?
>>> 
>>> Kind regards,
>>> Alex
>>> 
>>>> Am 07.09.2019 um 11:42 schrieb Giuseppe G. A. Celano 
>>>> :
>>>> 
>>>> Hi,
>>>> 
>>>> Apparently, I cannot access the visualization panels in the GUI on my Mac 
>>>> (10.13.6) (they cannot be selected): am I missing anything? Thanks.
>>>> 
>>>> Best,
>>>> Giuseppe
>>> 
>> 
> 
> 



[basex-talk] GUI, visualization panels

2019-09-07 Thread Giuseppe G. A. Celano
Hi,

Apparently, I cannot access the visualization panels in the GUI on my Mac 
(10.13.6) (they cannot be selected): am I missing anything? Thanks.

Best,
Giuseppe

Re: [basex-talk] xs:string("<")

2019-08-25 Thread Giuseppe G. A. Celano
Thanks!

> On Aug 25, 2019, at 2:26 PM, Martin Honnen  wrote:
> 
> Am 25.08.2019 um 14:16 schrieb Giuseppe G. A. Celano:
>> 
>> 
>> I am wondering why xs:string("&") is not possible, but xs:string("<") is 
>> (although XML does not allow both ). Is there any reason? Thanks.
>> 
>> 
> 
> You are allowed to use some predefined entity references and to use character 
> references in string literals, see 
> https://www.w3.org/TR/xquery-31/#prod-xquery31-StringLiteral 
> <https://www.w3.org/TR/xquery-31/#prod-xquery31-StringLiteral> which defines 
> [222]   StringLiteral   ::=   ('"' (PredefinedEntityRef | CharRef 
> | EscapeQuot | [^"&])* '"') | ("'" (PredefinedEntityRef | CharRef | 
> EscapeApos | [^'&])* "'")
> 
> so you can use the ampersand as a meta character for e.g. `&lt;` and e.g. 
> `&#60;`; on the other hand, to use it literally, you need some escape 
> mechanism in the form of `&amp;`.
> 
> Inside XML the angle brackets are needed for markup but that is not the case 
> inside of a string literal so you can use them freely in there.
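
A short sketch of the rule quoted above; every item in the sequence is a valid string literal, whereas a bare "&" inside the quotes would be a syntax error:

```xquery
(
  xs:string("<"),      (: allowed: "<" has no special meaning in a string literal :)
  xs:string("&lt;"),   (: the same character, written as a predefined entity reference :)
  xs:string("&#60;"),  (: the same character, written as a character reference :)
  xs:string("&amp;")   (: a literal ampersand has to be escaped like this :)
)
```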



[basex-talk] xs:string("<")

2019-08-25 Thread Giuseppe G. A. Celano
Hi,

I am wondering why xs:string("&") is not possible, but xs:string("<") is 
(although XML does not allow both ). Is there any reason? Thanks.

Best,
Giuseppe 



Re: [basex-talk] Map serialization

2019-08-19 Thread Giuseppe G. A. Celano
Hi Christian,

I transformed some simple XML, which I edit on GitLab, into maps on the fly for 
join operations. I wanted to test performance with the maps being on my 
machine, and so I was looking for a quick solution to export them into a 
file/the GUI. I thought I could also run them with xquery:eval, but then I 
ran into that problem with the serialization of text nodes. Anyway, the 
conversion speeds up the lookups considerably, so I think I will adopt the 
on-the-fly transformation. Thanks for your answer!

Best,
Giuseppe

> On Aug 19, 2019, at 4:52 PM, Christian Grün  wrote:
> 
> Hi Giuseppe,
> 
> The exact rules for serializing maps can be looked up in the
> specification for XQuery Serialization 3.1 [1].
> 
> I can’t say too much about all the decisions that have been taken in
> the spec, but I remember that all decisions around the adaptive
> serialization method were, more or less inevitably, a compromise
> between finding a both easily accessible representation and one that
> keeps faith to specific data types. In real applications, it’s
> probably good not to serialize maps and arrays at all, and work with
> the contained data instead. If the output needs to be further
> processed, the JSON functions may be the ones to choose.
> 
> In which context do you work with serialized maps?
> 
> Cheers,
> Christian
> 
> [1] https://www.w3.org/TR/xslt-xquery-serialization-31/
> 
> 
> 
> On Mon, Aug 19, 2019 at 1:56 AM Giuseppe G. A. Celano
>  wrote:
>> 
>> Hi
>> 
>> When maps are serialized, the text nodes of an element (e.g., 
>> r/text()) are serialized without quotes and tests with "instance-of" 
>> show that they actually are text nodes: Couldn't they be serialized with 
>> quotes? I see that with data(), the text is serialized with quotes although 
>> it is xs:untypedAtomic. Is there a reason why the text node is kept as such 
>> and, for example, no casting/atomization happens when it becomes a value in 
>> a map? Thanks.
>> 
>> Best,
>> Giuseppe
>> 
> 



[basex-talk] Map serialization

2019-08-18 Thread Giuseppe G. A. Celano
Hi 

When maps are serialized, the text nodes of an element (e.g., r/text()) 
are serialized without quotes and tests with "instance-of" show that they 
actually are text nodes: Couldn't they be serialized with quotes? I see that 
with data(), the text is serialized with quotes although it is 
xs:untypedAtomic. Is there a reason why the text node is kept as such and, for 
example, no casting/atomization happens when it becomes a value in a map? 
Thanks.

Best,
Giuseppe
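
The behaviour described above can be checked with a small sketch. The element name r and its content are made up for illustration; the point is that a map stores whatever value it is given, so a text node stays a node unless it is atomized or cast explicitly:

```xquery
let $r := <r>abc</r>
let $m := map {
  "node":   $r/text(),        (: stored as a text node, no implicit atomization :)
  "atomic": data($r/text()),  (: atomized to xs:untypedAtomic :)
  "string": string($r)        (: converted to xs:string :)
}
return (
  $m?node   instance of text(),
  $m?atomic instance of xs:untypedAtomic,
  $m?string instance of xs:string
)
(: all three tests yield true :)
```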



Re: [basex-talk] Binary module

2019-08-14 Thread Giuseppe G. A. Celano
Hi Christian,

Thanks! I missed that specification. Is there any reason why only the first 
octet is provided? More generally, I was interested in testing how many 
bits/octets are used to represent an integer.


> On Aug 14, 2019, at 10:58 AM, Christian Grün  
> wrote:
> 
> bin:length(convert:integers-to-base64(1 to 1000))



Re: [basex-talk] Binary module

2019-08-14 Thread Giuseppe G. A. Celano
Thank you! This is very helpful. As to the bytes of an integer: since I always 
get one byte, I would assume that its size is not limited to 8 bits, in which 
case I should also get values bigger than 255. However, I never get such 
values, no matter what the value of the integer is (e.g., 
convert:integers-to-base64(34777) => convert:binary-to-integers() ). I notice 
that the result for int 0 is 0 and for int 256 is also 0: it seems it always 
outputs octet values, even if more than 1 octet is needed.


> On Aug 14, 2019, at 9:54 AM, Michael Seiferle  wrote:
> 
> Hi Giuseppe, 
> 
> 1) You are right, it’s not too obvious what’s happening here: 
> In BaseX the default serialization mode is "basex" which tries to return 
> items in a more readable way, hence internally your string is represented as 
> xs:base64Binary — but when it is output to the query results panel it will be 
> serialized according to whichever serialization is active; we opted for our 
> custom serialization as we considered it to be a sane default, as it is able 
> to serialize() all kinds of items (i.e. XML serialization won’t serialize 
> maps or arrays) and is generally a little more readable than adaptive because 
> we omit type information such as xs:base64Binary("c8Ogw6A=")
> 
> The following query serializes your value with different serialization 
> parameters. 
>> let $it := convert:string-to-base64("sàà")
>> 
>> for $method in (
>>   "basex", "xml", "adaptive", "json", "text","html"
>> )
>> return element { $method } {  
>>   serialize( 
>> $it,
>> map{ "method": $method }
>>   )
>> }
> 
> 
> 
> 2) The length method basically tells you how many bytes your string needs to 
> be encoded as utf8, you may use the following query and try it yourself:
>> let $str := "•"
>> let $bin := convert:string-to-base64($str)
>> return element _ {
>>   element str { $str },
>>   element b64 { $bin },
>>   element octets {
>>attribute length { bin:length($bin) },
>>for $byte in $bin => convert:binary-to-integers() 
>>return element octet {
>>  $byte => convert:integer-to-base(2)
>>}
>>   }  
>> }
> 
> Depending on the input string and its encoding (i.e. 'a' will only need one 
> byte, but 'ä' already needs two, '•' will even need three in utf8) your 
> string is converted to xs:base64Binary and the bin:length() function will 
> count how many octets are needed for this representation.
> 
> Hope this helps :-)
> 
> Best
> Michael
> 
> 
>> Am 14.08.2019 um 02:22 schrieb Giuseppe G. A. Celano 
>> mailto:cel...@informatik.uni-leipzig.de>>:
>> 
>> Hi
>> 
>> I am playing around with the binary module. I have two simple questions:
>> 
>> 1) convert:string-to-base64("sàà") returns sàà : what does it mean? I see in 
>> the documentation that I need to use string() to see the value of 
>> xs:base64Binary (c8Ogw6A=), but shouldn't xs:base64Binary already be 
>> outputted as c8Ogw6A= ? (This is actually displayed in the Info Panel).
>> 
>> 2) bin:length(convert:integers-to-base64(x)) always returns one number, no 
>> matter how big the number is. In the documentation I read that the output of 
>> bin:length should be the size of binary data in octets: how is that 
>> possible?
>> 
>> Thanks,
>> Giuseppe
>> 
>> 
> 



[basex-talk] Binary module

2019-08-13 Thread Giuseppe G. A. Celano
Hi

I am playing around with the binary module. I have two simple questions:

1) convert:string-to-base64("sàà") returns sàà : what does it mean? I see in 
the documentation that I need to use string() to see the value of 
xs:base64Binary (c8Ogw6A=), but shouldn't xs:base64Binary already be outputted 
as c8Ogw6A= ? (This is actually displayed in the Info Panel).

2) bin:length(convert:integers-to-base64(x)) always returns one number, no 
matter how big the number is. In the documentation I read that the output of 
bin:length should be the size of binary data in octets: how is that possible?

Thanks,
Giuseppe
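
A short sketch of the points raised in this thread, assuming the convert and bin prefixes are predeclared as in BaseX. convert:integers-to-base64 produces one octet per supplied integer, whereas bin:pack-integer from the EXPath Binary Module encodes a single integer in a chosen number of octets; the size 2 is only an illustrative choice:

```xquery
(
  (: one octet per integer of the input sequence :)
  bin:length(convert:integers-to-base64(1 to 1000)),
  (: only the lowest 8 bits are kept, so 256 comes back as 0,
     as reported in the replies :)
  convert:binary-to-integers(convert:integers-to-base64(256)),
  (: encode 34777 in two octets, then read the value back :)
  bin:length(bin:pack-integer(34777, 2)),
  bin:unpack-unsigned-integer(bin:pack-integer(34777, 2), 0, 2)
)
```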




Re: [basex-talk] Tumbling window

2019-04-03 Thread Giuseppe G. A. Celano
Thanks. I was close :)

Dr. Giuseppe G. A. Celano
DFG-project leader <http://gepris.dfg.de/gepris/projekt/408121292>
Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
Tel: +4934132223
04109 Leipzig
Deutschland

E-mail: cel...@informatik.uni-leipzig.de
Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano
Web site 2: https://sites.google.com/site/giuseppegacelano/

> On Apr 3, 2019, at 10:59 AM, Kirsten, Dirk  wrote:
> 
> for tumbling window $s in ("this", "is", "an", "example", "." , "this", "is", 
> "another", "[", "example", ".", "]", "Another", "example", ".")
> start  $a when fn:true() 
> end $b previous $prev next $r when ($b = "]" and $prev = ".") or ($b = "." 
> and $r != "]")
> return
> {$s}
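
For reference, a self-contained variant of the query above that joins each window into one string, so that the result is easy to check. The string-join call is an addition for illustration; the window conditions are unchanged:

```xquery
for tumbling window $s in (
  "this", "is", "an", "example", ".",
  "this", "is", "another", "[", "example", ".", "]",
  "Another", "example", "."
)
  (: a new window starts right after the previous one has ended :)
  start $a when fn:true()
  (: end at "]" if it follows ".", or at "." unless "]" comes next :)
  end $b previous $prev next $r
    when ($b = "]" and $prev = ".") or ($b = "." and $r != "]")
return string-join($s, " ")
(: yields three strings:
   "this is an example ."
   "this is another [ example . ]"
   "Another example ." :)
```

The last window is returned although the end condition is not met there (there is no next item), because a tumbling window without "only end" also emits a window that is still open when the input ends.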



[basex-talk] Tumbling window

2019-04-03 Thread Giuseppe G. A. Celano
I have written the following code:

for tumbling window $s in ("this", "is", "an", "example", "." , "this", "is", 
"another", "[", "example", ".", "]", "Another", "example", ".")
start  $a when fn:true() 
end $b next $r when ($b = "." and $r = "]") or $b = "."
return
{$s}

which returns:

this is an example .
this is another [ example .
] Another example .

but I am looking for 

this is an example .
this is another [ example . ]
Another example .

I am playing around with if-clauses, but I am wondering whether there is a more 
direct way to force the end of a sentence at "]" when it is preceded by "." or 
at "." when there is no following "]". Thanks.

Ciao,
Giuseppe



Re: [basex-talk] Write functions in sequence

2019-03-12 Thread Giuseppe G. A. Celano
The code is long, because there are many functions. However, I did something 
like the following, and it works:

let $o :=  file:create-dir("/Users/mycomputer/prova")
let $o3 := file:write("/Users/mycomputer/prova/file2.xml", 
 (file:write("/Users/mycomputer/prova/file1.xml", "ciao"), 
file:read-text("/Users/mycomputer/prova/file1.xml"))   
  )

return
($o, $o3)

The evaluation of the first write() is forced to happen and conclude before the 
second write() is evaluated. I am still experimenting on this, but would there 
be a way to force completion of evaluation without embedding?
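
One option without embedding is the pattern from Michael's reply quoted below: the side-effecting calls are written as items of one comma-separated sequence, which according to that reply is supposed to be processed in the given order. A minimal sketch with the paths from above; the variable $dir is only introduced for brevity:

```xquery
let $dir := "/Users/mycomputer/prova/"
return (
  (: 1. create the directory :)
  file:create-dir($dir),
  (: 2. write the first file :)
  file:write($dir || "file1.xml", "ciao"),
  (: 3. read the first file and write its content to the second file :)
  file:write($dir || "file2.xml", file:read-text($dir || "file1.xml"))
)
```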



> On Mar 12, 2019, at 11:44 AM, Giuseppe G. A. Celano 
>  wrote:
> 
> let $o :=  file:create-dir("/Users/mycomputer/prova")
> let $o2 := file:write("/Users/mycomputer/prova/file1.xml", "ciao")
> let $o3 := file:write("/Users/mycomputer/prova/file2.xml", 
> file:read-text("/Users/mycomputer/prova/file1.xml"))
> return
> ($o, $o2, $o3)
> 
> This actually works. In my real example the writing of $o2 requires a few 
> seconds. It might be that $o3 is evaluated while $o2 is still running?
> 
>> On Mar 12, 2019, at 11:27 AM, Michael Seiferle > <mailto:m...@basex.org>> wrote:
>> 
>> Hi Giuseppe, 
>> 
>> The following pattern is supposed to / does work:
>>> file:write("1.txt", "Written to 1.txt"),
>>> file:write("2.txt", file:read-text("1.txt")),
>>> "Read from 2.txt: " || file:read-text('2.txt')
>> 
>> 
>> Could you maybe elaborate a bit more on your code?
>> 
>> Best from Konstanz
>> 
>> Michael 
>> 
>>> Am 12.03.2019 um 11:19 schrieb Giuseppe G. A. Celano 
>>> >> <mailto:cel...@informatik.uni-leipzig.de>>:
>>> 
>>> Hi
>>>  
>>> I wrote a single script which should do: write a file -> open this file -> 
>>> write another different file. I put the write expressions in the right 
>>> sequence, but it seems that the second one cannot happen because the file 
>>> created by the first write function has not yet been created at the time 
>>> the second function is invoked. Does anyone have a suggestion about this? 
>>> Thanks.
>>> 
>>> Best,
>>> Giuseppe
>> 
> 



Re: [basex-talk] Write functions in sequence

2019-03-12 Thread Giuseppe G. A. Celano
let $o :=  file:create-dir("/Users/mycomputer/prova")
let $o2 := file:write("/Users/mycomputer/prova/file1.xml", "ciao")
let $o3 := file:write("/Users/mycomputer/prova/file2.xml", 
file:read-text("/Users/mycomputer/prova/file1.xml"))
return
($o, $o2, $o3)

This actually works. In my real example the writing of $o2 requires a few 
seconds. It might be that $o3 is evaluated while $o2 is still running?

> On Mar 12, 2019, at 11:27 AM, Michael Seiferle  wrote:
> 
> Hi Giuseppe, 
> 
> The following pattern is supposed to / does work:
>> file:write("1.txt", "Written to 1.txt"),
>> file:write("2.txt", file:read-text("1.txt")),
>> "Read from 2.txt: " || file:read-text('2.txt')
> 
> 
> Could you maybe elaborate a bit more on your code?
> 
> Best from Konstanz
> 
> Michael 
> 
>> Am 12.03.2019 um 11:19 schrieb Giuseppe G. A. Celano 
>> mailto:cel...@informatik.uni-leipzig.de>>:
>> 
>> Hi
>>  
>> I wrote a single script which should do: write a file -> open this file -> 
>> write another different file. I put the write expressions in the right 
>> sequence, but it seems that the second one cannot happen because the file 
>> created by the first write function has not yet been created at the time the 
>> second function is invoked. Does anyone have a suggestion about this? Thanks.
>> 
>> Best,
>> Giuseppe
> 



[basex-talk] Write functions in sequence

2019-03-12 Thread Giuseppe G. A. Celano
Hi
 
I wrote a single script which should do: write a file -> open this file -> 
write another different file. I put the write expressions in the right 
sequence, but it seems that the second one cannot happen because the file 
created by the first write function has not yet been created at the time the 
second function is invoked. Does anyone have a suggestion about this? Thanks.

Best,
Giuseppe

Re: [basex-talk] Stack Overflow: Try tail recursion

2019-02-20 Thread Giuseppe G. A. Celano
Thanks! This works perfectly


> On Feb 19, 2019, at 6:08 PM, Bridger Dyson-Smith  
> wrote:
> 
> Hi Michael - 
> that's a very nice solution! Thanks for sharing!
> 
> On Tue, Feb 19, 2019 at 6:14 AM Michael Seiferle  <mailto:m...@basex.org>> wrote:
> Hi Giuseppe, 
> 
> 
> unfortunately the example won’t be tail-call optimized as your last statement 
> is not the function call itself but a sequence construction that happens to 
> contain the function call.
> 
> Your function must be of the form:
> 
>> f(x) => s(x)if your termination condition is met (i.e. no more books)
>> f(x) => f(g(x))
> 
> Your function is defined (more or less) as something like:
>> f(x) => s(x)if your termination condition is met (i.e. no more books)
>> f(x) =>  something  +  f(g(x))
> 
> So you have to get that something (i.e. the sequence concatenation) part 
> inside your g-function, here’s what I think you might want:
> 
> Cross-posted as gist for better readability 
> https://gist.github.com/micheee/ef75c9f30449c2de3406182ff2fdce50 
> <https://gist.github.com/micheee/ef75c9f30449c2de3406182ff2fdce50> 
>>   declare variable $bookstore := {
>> for $i in 1 to 30
>> return element book {
>>   element name {
>> "Book " || $i
>>   },
>>   element price {
>> 1
>>   },
>>   element author {
>> "Author "|| $i
>>   }
>> }
>>   }
>>   ;
>>   
>>   (:~ 
>>   Rolling total of prices
>>   @param $prices, accumulates the book prices.
>>  $prices[$n] contains the sum of prices for $book[position() <= $n]
>>   @param $books sequence of books not yet added to $prices
>>   :)
>>   declare function local:sum($prices, $books)
>>   {
>>   let $book   := head($books)   (: Get first book in sequence :)
>>   let $prices := if(count($prices) = 0) (: if empty, initialize prices 
>> with first price :)
>> then ($book/price )   
>> else ((: add the current rolling total to the 
>> list of rolling totals :)
>>   (: that’s the concatenation part :)
>>   $prices,
>>   element price { $prices[count($prices)] + $book/price } 
>> )  
>>   return (
>> if(count($books) > 1) then 
>>   local:sum($prices, tail($books))  (: here's the tail call :)
>>     else $prices
>>   )
>>   };
>>   
>>   
>>   {
>>   local:sum((), $bookstore/book) 
>>   }
>>   
> 
> You can check if your query is tail-call optimized using the query info 
> panel: 
>> - mark as tail call: local:sum(if(empty($prices_2)) then (hea...
> 
> 
> 
> Hope this helps!
> 
> Best from Konstanz
> 
> Michael
> 
>> Am 18.02.2019 um 19:12 schrieb Giuseppe G. A. Celano 
>> mailto:cel...@informatik.uni-leipzig.de>>:
>> 
>> Hi Jonathan, 
>> 
>> Thanks for that. However, it returns the same stack overflow error as the 
>> other script when there are 38000 <book> elements. Increasing the JVM size 
>> does not help either.
>> 
>>> On Feb 18, 2019, at 4:51 PM, Jonathan Robie >> <mailto:jonathan.ro...@gmail.com>> wrote:
>>> 
>>> To make it tail-recursive, make the recursive call the last operation in 
>>> the function.
>>> 
>>> https://en.wikipedia.org/wiki/Tail_call 
>>> <https://en.wikipedia.org/wiki/Tail_call>
>>> 
>>> The else() clause is what keeps it from being tail recursive.  Something 
>>> like this should work:
>>> 
>>> declare variable $bookstore := 
>>>   
>>> story
>>> 50.00
>>> smith
>>>   
>>>   
>>> history
>>> 150.00
>>> kelly
>>>   
>>>   
>>> epic
>>> 300.00
>>> jones
>>>   
>>> ;
>>> 
>>> declare function local:sum($books, $sum)
>>> {
>>> let $sum :=  $sum + $books[1]/price
>>> return (
>>> { $sum },
>>> $books[2] ! local:sum(tail($books), $sum)
>>> )
>>> };
>>> 
>>> 
>>> {
>>> local:sum($bookstore/book, 0)
>>> }
>>> 
>>> 
>>> 
>>> Jonathan
>>> 
>>> On Mon, Feb 18, 2019 at 10:24 AM Giuseppe G. A. Celano 
>>> >> <mailto:cel...@informatik.uni-leipzig.de>> wrote:
>>> I am writing a recursive function which is similar to the one here:
>>> 
>>> https://stackoverflow.com/questions/27702718/to-add-values-in-cumulative-format
>>>  
>>> <https://stackoverflow.com/questions/27702718/to-add-values-in-cumulative-format>
>>> 
>>> Interestingly, local:sum() works if there are not many <book> elements. 
>>> However, with 38000 book elements I get the error "Stack Overflow: Try 
>>> tail recursion".
>>> 
>>> Any idea?
>>> 
>>> Ciao,
>>> Giuseppe
>>> 
>>> 
>> 
> 
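
The rolling total discussed in this thread can also be computed without user-defined recursion, which sidesteps the question of tail-call optimization. A minimal sketch with fold-left, using the three prices from Jonathan's sample data instead of the book elements:

```xquery
let $prices := (50.00, 150.00, 300.00)
return fold-left($prices, (), function($totals, $price) {
  (: keep the totals so far and append: last total + current price;
     for the first price, $totals[last()] is empty and sum() ignores it :)
  $totals, sum(($totals[last()], $price))
})
(: yields 50, 200, 500 :)
```

For real data, $prices could be replaced by a path such as $bookstore/book/price, provided the values can be treated as numbers.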



Re: [basex-talk] Stack Overflow: Try tail recursion

2019-02-18 Thread Giuseppe G. A. Celano
Hi Jonathan, 

Thanks for that. However, it returns the same stack overflow error as the other 
script when there are 38000 <book> elements. Increasing the JVM size does not 
help either.

> On Feb 18, 2019, at 4:51 PM, Jonathan Robie  wrote:
> 
> To make it tail-recursive, make the recursive call the last operation in the 
> function.
> 
> https://en.wikipedia.org/wiki/Tail_call 
> <https://en.wikipedia.org/wiki/Tail_call>
> 
> The else() clause is what keeps it from being tail recursive.  Something like 
> this should work:
> 
> declare variable $bookstore := 
>   
> story
> 50.00
> smith
>   
>   
> history
> 150.00
> kelly
>   
>   
> epic
> 300.00
> jones
>   
> ;
> 
> declare function local:sum($books, $sum)
> {
> let $sum :=  $sum + $books[1]/price
> return (
> { $sum },
> $books[2] ! local:sum(tail($books), $sum)
>     )
> };
> 
> 
> {
> local:sum($bookstore/book, 0)
> }
> 
> 
> 
> Jonathan
> 
> On Mon, Feb 18, 2019 at 10:24 AM Giuseppe G. A. Celano 
> mailto:cel...@informatik.uni-leipzig.de>> 
> wrote:
> I am writing a recursive function which is similar to the one here:
> 
> https://stackoverflow.com/questions/27702718/to-add-values-in-cumulative-format
>  
> <https://stackoverflow.com/questions/27702718/to-add-values-in-cumulative-format>
> 
> Interestingly, local:sum() works if there are not many <book> elements. 
> However, with 38000 book elements I get the error "Stack Overflow: Try tail 
> recursion".
> 
> Any idea?
> 
> Ciao,
> Giuseppe
> 
> 



[basex-talk] Stack Overflow: Try tail recursion

2019-02-18 Thread Giuseppe G. A. Celano
I am writing a recursive function which is similar to the one here:

https://stackoverflow.com/questions/27702718/to-add-values-in-cumulative-format

Interestingly, local:sum() works if there are not many <book> elements. 
However, with 38000 book elements I get the error "Stack Overflow: Try tail 
recursion".

Any idea?

Ciao,
Giuseppe




[basex-talk] If clause

2019-02-18 Thread Giuseppe G. A. Celano
Hi,

I see that in BaseX 9.1.2 an expression such as "if (3) then 4 " does not raise 
an error, even if the "else" part is missing. Is this correct?

Ciao,
Giuseppe



Re: [basex-talk] proc:execute

2019-02-14 Thread Giuseppe G. A. Celano
Hi Christian,

The problem on a Mac is that it is not that easy to change the root settings. 
However, I have found a workaround, which may be useful to others: instead of 
invoking something like proc:system("python", "main.py"), I create a bash file 
like:

#!/bin/bash

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
absolute-path/python absolute-path/main.py

and then I run proc:system("bash", "path-to-the-bash-file") in BaseX. This 
works!

Ciao,
Giuseppe
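
As an alternative to the wrapper script, and assuming the Python script itself can be edited, the file could be opened with an explicit encoding so that the result no longer depends on the locale of the calling process. This is only a sketch: load_lines is a hypothetical stand-in for the loader in utils.py, and the sample file is created just for the demonstration.

```python
import os
import tempfile


def load_lines(path, encoding="utf-8"):
    """Read a text file with an explicit encoding instead of the locale default."""
    with open(path, encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]


# Demo: write a UTF-8 file containing non-ASCII characters, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("caf\u00e9\n\u2019quoted\u2019\n")
    sample_lines = load_lines(sample_path)

print(len(sample_lines), "lines read")
```

With the encoding passed to open(), the same call behaves identically whether the script is started from a terminal or via proc:system.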

Dr. Giuseppe G. A. Celano
DFG-project leader <http://gepris.dfg.de/gepris/projekt/408121292>
Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
Tel: +4934132223
04109 Leipzig
Deutschland

E-mail: cel...@informatik.uni-leipzig.de
Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano
Web site 2: https://sites.google.com/site/giuseppegacelano/

> On Feb 13, 2019, at 5:16 PM, Christian Grün  wrote:
> 
> Hi Giuseppe,
> 
> Have you tried to set your locale variable on your system? If it’s
> Ubuntu, you could have a look here:
> 
> https://ubuntuforums.org/showthread.php?t=2212353
> 
> Hope this helps,
> Christian
> 
> 
> On Tue, Feb 12, 2019 at 4:13 PM Giuseppe G. A. Celano
>  wrote:
>> 
>> I notice that if I run "locale" from my Mac Terminal I get the correct one 
>> (UTF-8), but if I run proc:system("locale") I get:
>> 
>> LANG=
>> LC_COLLATE="C"
>> LC_CTYPE="C"
>> LC_MESSAGES="C"
>> LC_MONETARY="C"
>> LC_NUMERIC="C"
>> LC_TIME="C"
>> LC_ALL=
>> 
>> Is there a way to force BaseX to start with utf-8? Thanks.
>> 
>> Ciao,
>> Giuseppe
>> 
>> 
>> 
>> On Feb 12, 2019, at 9:12 AM, Giuseppe G. A. Celano 
>>  wrote:
>> 
>> Hi Liam
>> 
>> Thanks. My locale is actually "en_US.UTF-8", so I do not know why the error 
>> is raised
>> 
>> On Feb 12, 2019, at 6:30 AM, Liam R. E. Quin  wrote:
>> 
>> On Tue, 2019-02-12 at 01:42 +0100, Giuseppe G. A. Celano wrote:
>> 
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position
>> 5390: ordinal not in range(128)
>> 
>> 
>> A guess - sounds like an encoding error - 0xE2 is â in Unicode, and 128
>> suggests US ASCII was expected - check the encoding declaration on the
>> XML, or maybe it's a locale difference?
>> 
>> --
>> Liam Quin, https://www.delightfulcomputing.com/
>> Available for XML/Document/Information Architecture/XSLT/
>> XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
>> Web slave for vintage clipart http://www.fromoldbooks.org/
>> 
>> 
>> 
> 



Re: [basex-talk] proc:execute

2019-02-12 Thread Giuseppe G. A. Celano
I notice that if I run "locale" from my Mac Terminal I get the correct one 
(UTF-8), but if I run proc:system("locale") I get:

LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

Is there a way to force BaseX to start with utf-8? Thanks.

Ciao,
Giuseppe
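
The difference between the Terminal and BaseX is consistent with how child processes inherit their environment: a process only sees the locale variables its parent passes on, and an application started outside a login shell may not have them set. A small sketch of that mechanism, assuming a Python interpreter is available; the helper name child_locale_settings is hypothetical, and the encoding it reports depends on the platform and the Python version.

```python
import os
import subprocess
import sys


def child_locale_settings(extra_env):
    """Run a child Python process with `extra_env` added and report what it sees."""
    env = dict(os.environ)
    env.update(extra_env)
    # The child prints its LC_ALL value and the encoding Python derives from the locale.
    probe = (
        "import os, locale; "
        "print(os.environ.get('LC_ALL', '')); "
        "print(locale.getpreferredencoding(False))"
    )
    out = subprocess.run(
        [sys.executable, "-c", probe],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout.splitlines()
    return {"LC_ALL": out[0], "preferred_encoding": out[1]}


print(child_locale_settings({"LC_ALL": "C"}))
print(child_locale_settings({"LC_ALL": "en_US.UTF-8"}))
```

Running the same probe once from the Terminal and once via proc:system would show which variables actually reach the child in each case.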



> On Feb 12, 2019, at 9:12 AM, Giuseppe G. A. Celano 
>  wrote:
> 
> Hi Liam
> 
> Thanks. My locale is actually "en_US.UTF-8", so I do not know why the error 
> is raised 
> 
>> On Feb 12, 2019, at 6:30 AM, Liam R. E. Quin  wrote:
>> 
>> On Tue, 2019-02-12 at 01:42 +0100, Giuseppe G. A. Celano wrote:
>>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position
>>> 5390: ordinal not in range(128)
>> 
>> A guess - sounds like an encoding error - 0xE2 is â in Unicode, and 128
>> suggests US ASCII was expected - check the encoding declaration on the
>> XML, or maybe it's a locale difference?
>> 
>> -- 
>> Liam Quin, https://www.delightfulcomputing.com/
>> Available for XML/Document/Information Architecture/XSLT/
>> XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
>> Web slave for vintage clipart http://www.fromoldbooks.org/
>> 
> 



Re: [basex-talk] proc:execute

2019-02-12 Thread Giuseppe G. A. Celano
Hi Liam

Thanks. My locale is actually "en_US.UTF-8", so I do not know why the error is 
raised.

> On Feb 12, 2019, at 6:30 AM, Liam R. E. Quin  wrote:
> 
> On Tue, 2019-02-12 at 01:42 +0100, Giuseppe G. A. Celano wrote:
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position
>> 5390: ordinal not in range(128)
> 
> A guess - sounds like an encoding error - 0xE2 is â in Unicode, and 128
> suggests US ASCII was expected - check the encoding declaration on the
> XML, or maybe it's a locale difference?
> 
> -- 
> Liam Quin, https://www.delightfulcomputing.com/
> Available for XML/Document/Information Architecture/XSLT/
> XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
> Web slave for vintage clipart http://www.fromoldbooks.org/
> 



[basex-talk] proc:execute

2019-02-11 Thread Giuseppe G. A. Celano
Hi,

I am trying to run a parser via proc:execute/proc:system. It works when I run 
it from the command line, but not when I run it via BaseX. More precisely, I 
get the following error:

Traceback (most recent call last):
  File "main.py", line 325, in <module>
    test_data = loader.load(params.test)
  File "/Users/mycomputer/Desktop/Basex9.1beta 2/utils.py", line 57, in load
    for line in f:
  File "/Users/mycomputer/anaconda/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 5390: ordinal not in range(128)

Is there a way to avoid that? Thanks.

Best,
Giuseppe
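
The error in the traceback above can be reproduced in isolation, which may help to narrow it down: 0xe2 is the first byte of many three-byte UTF-8 sequences, and the ASCII codec rejects any byte above 0x7f. In the sketch below the sample character (U+2019, a typographic apostrophe) is an assumption, since the actual character at position 5390 of the input is not known, and decode_or_error is a hypothetical helper.

```python
# "it's" with a typographic apostrophe (U+2019); its UTF-8 form starts with byte 0xe2.
raw = "it\u2019s".encode("utf-8")


def decode_or_error(data, encoding):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: {}".format(exc)


ascii_result = decode_or_error(raw, "ascii")
utf8_result = decode_or_error(raw, "utf-8")

print(ascii_result)        # the ASCII codec fails on the 0xe2 byte
print(ascii(utf8_result))  # the UTF-8 codec decodes the same bytes
```

The same bytes decode cleanly as UTF-8, which suggests the data is fine and only the default encoding picked by the Python process differs.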

Re: [basex-talk] BaseX 9.1.1

2018-12-17 Thread Giuseppe G. A. Celano
Awesome! Thank you so much for making BaseX better and better!

Ciao,
Giuseppe

Dr. Giuseppe G. A. Celano
DFG-project leader <http://gepris.dfg.de/gepris/projekt/408121292>
Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
Tel: +4934132223
04109 Leipzig
Deutschland

E-mail: cel...@informatik.uni-leipzig.de
Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano
Web site 2: https://sites.google.com/site/giuseppegacelano/

> On Dec 14, 2018, at 11:31 AM, Christian Grün  
> wrote:
> 
> Hi all,
> 
> we are glad to provide you with version 9.1.1 of BaseX! It’s actually
> more than just a maintenance release:
> 
> XQuery:
> - Comprehensive rewritings of positional predicates and functions
> - Higher-order functions: Improved type inference
> - Improved rewriting of context-based and/or nested predicates
> 
> Java Bindings:
> - Faster access to and evaluation of Java functions and variables
> - Improved pre-selection of function candidates at compile time
> - Better error messages (incl. function arity and similar names)
> 
> DBA:
> - Settings: user-defined pattern for ignoring log entries
> - Login: pass on URL query strings
> 
> Minor improvements:
> - Import: detect epub files as ZIP archives
> - Digest Authentication: No delay after first request
> - GUI, Preferences: user-defined choice of XML suffixes
> 
> The new version is online: http://basex.org/
> 
> Looking forward to your feedback as usual.
> Have a nice time,
> Christian
> 



Re: [basex-talk] restxq

2018-11-11 Thread Giuseppe G. A. Celano
Hi Andy,

Thank you so much. Thanks to your suggestion, I was able to identify the right 
directory. I had copied my BaseX folder from the Desktop into another folder 
and, strangely, the WEBPATH of the folder on the Desktop was still being used, 
even though I ran the basexhttp command from the copy.

Ciao,
Giuseppe


> On Nov 9, 2018, at 6:21 PM, Andy Bunce  wrote:
> 
> Hi Giuseppe,
> 
> You can use the DBA app to see which settings are in use. Go to the URL 
> /dba/settings and check WEBPATH.
> 
> /Andy
> 
> 
> On Fri, 9 Nov 2018 at 13:10, Giuseppe G. A. Celano <cel...@informatik.uni-leipzig.de> wrote:
> Hi Marco,
> 
> I think you are right, but I cannot find the webapp folder which is used when 
> I launch basexhttp. I am sure the basexhttp I launch is the one I want to 
> launch, but the webapp folder it checks for .xqm files is not the one within 
> the "basex" folder containing both of them.
> 
> 
>> On Nov 9, 2018, at 1:34 PM, Marco Lettere <m.lett...@gmail.com> wrote:
>> 
>> It looks like your basexhttp server is pointing to a different directory 
>> than the one you expect.
>> This might happen from time to time, depending on the type of installation, 
>> environment variables and things like that.
>> M.
>> 
>> On 09/11/18 13:30, Christian Grün wrote:
>>> …difficult to tell. Could you please provide us with a minimized
>>> version and tell us the exact steps how to proceed (1. download
>>> basex91.zip, 2. unzip, etc.)?
>>> On Fri, Nov 9, 2018 at 1:03 PM Giuseppe G. A. Celano
>>> <cel...@informatik.uni-leipzig.de> wrote:
>>>> I am trying to make a RESTXQ webservice I created with BaseX 8.3 available 
>>>> in BaseX 9.2. I simply copied my "file.xqm" into the webapp folder but, 
>>>> when I type the path of a function contained in it, it does not work ("No 
>>>> function found that matches the request."). Strangely enough, when I 
>>>> minimally modify restxq.xqm, the modifications do not apply either. Any 
>>>> idea about what the problem might be? Thanks.
>> 
>> 
> 



Re: [basex-talk] restxq

2018-11-09 Thread Giuseppe G. A. Celano
Hi Marco,

I think you are right, but I cannot find the webapp folder which is used when I 
launch basexhttp. I am sure the basexhttp I launch is the one I want to launch, 
but the webapp folder it checks for .xqm files is not the one within the 
"basex" folder containing both of them.


> On Nov 9, 2018, at 1:34 PM, Marco Lettere  wrote:
> 
> It looks like your basexhttp server is pointing to a different directory than 
> the one you expect.
> This might happen from time to time, depending on the type of installation, 
> environment variables and things like that.
> M.
> 
> On 09/11/18 13:30, Christian Grün wrote:
>> …difficult to tell. Could you please provide us with a minimized
>> version and tell us the exact steps how to proceed (1. download
>> basex91.zip, 2. unzip, etc.)?
>> On Fri, Nov 9, 2018 at 1:03 PM Giuseppe G. A. Celano
>>  wrote:
>>> I am trying to make a RESTXQ webservice I created with BaseX 8.3 available 
>>> in BaseX 9.2. I simply copied my "file.xqm" into the webapp folder but, 
>>> when I type the path of a function contained in it, it does not work ("No 
>>> function found that matches the request."). Strangely enough, when I 
>>> minimally modify restxq.xqm, the modifications do not apply either. Any 
>>> idea about what the problem might be? Thanks.
> 
> 



[basex-talk] restxq

2018-11-09 Thread Giuseppe G. A. Celano
I am trying to make a RESTXQ web service I created with BaseX 8.3 available in 
BaseX 9.2. I simply copied my "file.xqm" into the webapp folder, but when I 
request the path of a function contained in it, it does not work ("No function 
found that matches the request."). Strangely enough, when I minimally modify 
restxq.xqm, the modifications are not applied either. Any idea what the 
problem might be? Thanks.