Re: [basex-talk] http:send-request - problem with response

2019-11-13 Thread Bogdan Bogucki

Hi Christian,

It works. Thank you for the quick fix.

Best Regards

Bogdan Bogucki

On 11/13/2019 at 17:42, Christian Grün wrote:

Hi Bogdan (cc to the list),

Thanks for digging deeper. I noticed that the standard Java function
that we used returned only one value per header field, and dropped the
others [1].
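
To illustrate the difference, here is a minimal sketch in Python (not the BaseX or Java code itself). It assumes the response headers arrive as a list of name/value pairs; the cookie values and both function names are made up for the example. Keeping one value per field name silently discards repeated fields such as Set-Cookie, whereas grouping the values in a list preserves all of them.

```python
from collections import defaultdict

# Hypothetical header pairs; the cookie values are placeholders, not the real ones.
HEADERS = [
    ("Content-Type", "text/html; charset=utf-8"),
    ("Set-Cookie", "a=1; HttpOnly"),
    ("Set-Cookie", "b=2; path=/"),
    ("Set-Cookie", "c=3; path=/; HttpOnly"),
]


def one_value_per_field(pairs):
    """Keep a single value per header name; later values overwrite earlier ones."""
    return {name.lower(): value for name, value in pairs}


def all_values_per_field(pairs):
    """Group every value under its (case-insensitive) header name."""
    grouped = defaultdict(list)
    for name, value in pairs:
        grouped[name.lower()].append(value)
    return dict(grouped)
```

With the sample data, one_value_per_field(HEADERS) keeps only the last Set-Cookie value, which matches the symptom reported in this thread, while all_values_per_field(HEADERS) returns all three.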

I managed to fix this in the latest stable snapshot; could you give it
a try [2]?

BaseX 9.3 will be released end of November.

Cheers,
Christian

[1] https://github.com/BaseXdb/basex/issues/1751
[2] http://files.basex.org/releases/latest/




On Wed, Nov 13, 2019 at 4:26 PM Bogdan Bogucki  wrote:

Hi Christian,
The lack of support for multiple redirects is not a big problem; I can handle them manually.
The problem is the incomplete response header from the first request.

The browser makes three redirects; I am talking about the first one.

Please take a look.


The response header from the browser (first request) contains three Set-Cookie
fields:

set-cookie __cfduid=daf54110b2a87d66c2a53…; domain=.pracuj.pl; HttpOnly

set-cookie _yaic=13; expires=Fri, 31-Dec- 23:59:59 GMT; path=/

set-cookie _urnadiam=A; domain=.pracuj.pl…3:00:00 GMT; path=/; HttpOnly

but the response header from http:send-request contains only the last one:



I need the information from the previous Set-Cookie fields to perform the manual
redirects and pass the session information on to the next requests.
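
As a rough sketch of what passing the session information on could involve, the following Python function (a hypothetical helper, not part of BaseX) turns the raw Set-Cookie values of one response into a single Cookie header value for the next request. It assumes that only the leading name=value pair of each field matters and deliberately ignores attributes such as domain, path and expires, which a real client would have to check.

```python
def cookie_header(set_cookie_values):
    """Build a Cookie request header value from raw Set-Cookie response values.

    Only the leading name=value pair of each field is kept; attributes such as
    domain, path, expires and HttpOnly are ignored in this simplified sketch.
    """
    pairs = []
    for value in set_cookie_values:
        pair = value.split(";", 1)[0].strip()
        if "=" in pair:
            pairs.append(pair)
    return "; ".join(pairs)
```

For example, cookie_header(["a=1; domain=.example.org; HttpOnly", "b=2; path=/"]) yields "a=1; b=2", which could then be sent as the Cookie header of the follow-up request.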

Regards

Bogdan Bogucki

On 29.10.2019 at 03:06, Christian Grün wrote:

Hi Bogdan,

The current http:send-request implementation is based on the default
Java HttpURLConnection, which does not resolve redirects that use
different protocols [1]. This is the reason why your request will not
be fully processed (as it e.g. happens when you use the browser).

It seems that your initial request to the https protocol returns a 302
redirect to a (now unsafe) http URL, which returns another redirect to
https. I don’t know who maintains the discussed web site, but it could
be worth contacting the admins and asking them if they could update
and simplify their redirect policy.
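
When following such redirects manually, it can help to detect whether a redirect switches protocols. A small Python sketch under stated assumptions: the function names and the example.org URLs are illustrative, and the check simply compares URL schemes after resolving the Location value against the requested URL.

```python
from urllib.parse import urljoin, urlsplit


def redirect_target(request_url, location):
    """Resolve a Location header value against the URL that was requested."""
    return urljoin(request_url, location)


def changes_protocol(request_url, location):
    """Return True if following the redirect would switch the URL scheme."""
    before = urlsplit(request_url).scheme.lower()
    after = urlsplit(redirect_target(request_url, location)).scheme.lower()
    return before != after
```

A redirect from https://www.example.org/job to http://www.example.org/job is reported as a protocol change, whereas a relative Location such as /other is not.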

Hope this helps,
Christian

[1] https://stackoverflow.com/a/1884427/1018790




On Fri, Oct 25, 2019 at 1:52 PM Bogdan Bogucki  wrote:

Hello,

I have encountered a problem with the http:send-request function. I need to
handle multiple (3) redirects with cookies.

The first request returns cookie information that is required for the next
requests. The output of http:send-request doesn't contain all the fields that
are returned by the server.

The response from the browser is the following:
cache-control: private
cf-cache-status: DYNAMIC
cf-ray: 52b3cedc5b9bcc9f-WAW
content-type: text/html; charset=utf-8
date: Fri, 25 Oct 2019 11:21:37 GMT
expect-ct: max-age=604800, report-uri="ht….com/cdn-cgi/beacon/expect-ct"
location: http://www.pracuj.pl/praca/jav…eloper-warszawa,oferta,7171988
server: cloudflare
set-cookie: __cfduid=daf54110b2a87d66c2a53…; domain=.pracuj.pl; HttpOnly
set-cookie: _yaic=13; expires=Fri, 31-Dec- 23:59:59 GMT; path=/
set-cookie: _urnadiam=A; domain=.pracuj.pl…3:00:00 GMT; path=/; HttpOnly
x-aspnet-version: 4.0.30319
x-aspnetmvc-version: 5.2
X-Firefox-Spdy: h2
x-powered-by: ASP.NET
x-ua-compatible: IE=edge


Request:

http:send-request(
  







  , 
'https://www.pracuj.pl/praca/java-developer-warszawa,oferta,7171988')

The response from http:send-request is:

http://expath.org/ns/http-client; status="302" 
message="Found">











https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"/>





Two set-cookie fields are missing.

How should I invoke http:send-request to receive the proper result?

Regards

Bogdan


Re: [basex-talk] proc:system

2019-11-13 Thread Christian Grün
Good to see, everything’s alright then. proc:system will only output a
result if the command was executed successfully.
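
The two behaviours discussed in this thread can be modelled outside BaseX. The following Python sketch is only an analogy and makes several assumptions: the stand-in command prints a usage text to standard error and exits with code 1 (the thread does not say which stream tesseract writes to), run_strict plays the role of a function that fails on a non-zero exit code, and run_lenient the role of one that always reports output, error text and code. All names are hypothetical.

```python
import subprocess
import sys

# Stand-in for a tool that prints its usage text to stderr and exits with code 1.
USAGE_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('Usage: tool imagename outputbase\\n'); sys.exit(1)",
]


def run_strict(cmd):
    """Return stdout; raise if the exit code is not 0."""
    done = subprocess.run(cmd, capture_output=True, text=True)
    if done.returncode != 0:
        # Mirror the "code" + four digits pattern seen in the error above.
        raise RuntimeError("code%04d" % done.returncode)
    return done.stdout


def run_lenient(cmd):
    """Always return output, error text and exit code, whatever the result."""
    done = subprocess.run(cmd, capture_output=True, text=True)
    return {"output": done.stdout, "error": done.stderr, "code": done.returncode}
```

Here run_lenient(USAGE_CMD) returns the usage text together with code 1, while run_strict(USAGE_CMD) raises an error, even though the command itself was started without problems.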


Giuseppe G. A. Celano wrote on Wed, 13 Nov 2019, 19:27:

> I would expect just some text, as with proc:system("ls")
>
> 
>   Usage:
>   /usr/local/bin/tesseract --help | --help-extra | --version
>   /usr/local/bin/tesseract --list-langs
>   /usr/local/bin/tesseract imagename outputbase [options...]
> [configfile...]
>
> OCR options:
>   -l LANG[+LANG]  Specify language(s) used for OCR.
> NOTE: These options must occur before any configfile.
>
> Single options:
>   --help        Show this help message.
>   --help-extra  Show extra help for advanced users.
>   --version     Show version information.
>   --list-langs  List available languages for tesseract engine.
> 
>   1
> 
>
> I guess that the "1" code blocks the printing. If I use
> proc:system("/usr/local/bin/tesseract", "--help"), it works.
>
> E-mail: cel...@informatik.uni-leipzig.de
> Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano
> Web site 2: https://sites.google.com/site/giuseppegacelano/
>
> On Nov 13, 2019, at 6:48 PM, Christian Grün 
> wrote:
>
> Interestingly, proc:execute("/usr/local/bin/tesseract") works (I have
> BaseX 9.2).
>
>
> What does the output look like?
>
> proc:system("/usr/local/bin/tesseract") returns the following:
>
>
> If code 1 is raised, it indicates that your command is indeed
> executed, but returns the exit code 1. Which output would
> you expect?
>
>
>
>
>
> SET DEBUG true
>
> DEBUG: true
>
> XQUERY proc:system("/usr/local/bin/tesseract")
>
> org.basex.query.QueryException:
> at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
> at org.basex.query.scope.MainModule.iter(MainModule.java:97)
> at org.basex.query.QueryContext.iter(QueryContext.java:332)
> at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
> at org.basex.core.cmd.AQuery.query(AQuery.java:107)
> at org.basex.core.cmd.XQuery.run(XQuery.java:22)
> at org.basex.core.Command.run(Command.java:257)
> at org.basex.core.Command.execute(Command.java:93)
> at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
> at org.basex.api.client.Session.execute(Session.java:36)
> at org.basex.core.CLI.execute(CLI.java:92)
> at org.basex.core.CLI.execute(CLI.java:76)
> at org.basex.BaseX.console(BaseX.java:176)
> at org.basex.BaseX.<init>(BaseX.java:151)
> at org.basex.BaseX.main(BaseX.java:42)
> org.basex.core.BaseXException: Stopped at ., 1/12:
> [proc:code0001]
> at org.basex.core.Command.execute(Command.java:94)
> at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
> at org.basex.api.client.Session.execute(Session.java:36)
> at org.basex.core.CLI.execute(CLI.java:92)
> at org.basex.core.CLI.execute(CLI.java:76)
> at org.basex.BaseX.console(BaseX.java:176)
> at org.basex.BaseX.<init>(BaseX.java:151)
> at org.basex.BaseX.main(BaseX.java:42)
> Caused by: org.basex.query.QueryException:
> at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
> at org.basex.query.scope.MainModule.iter(MainModule.java:97)
> at org.basex.query.QueryContext.iter(QueryContext.java:332)
> at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
> at org.basex.core.cmd.AQuery.query(AQuery.java:107)
> at org.basex.core.cmd.XQuery.run(XQuery.java:22)
> at org.basex.core.Command.run(Command.java:257)
> at org.basex.core.Command.execute(Command.java:93)
> ... 7 more
> Stopped at ., 1/12:
> [proc:code0001]
>
>
>
>
> On Nov 13, 2019, at 5:50 PM, Christian Grün 
> wrote:
>
> Hi Giuseppe,
>
> When I try to run
> proc:system("/usr/local/bin/tesseract") I get the error [proc:code0001]
>
>
> On my system, I get the (expected) error…
>
> [proc:error] Cannot run program "/usr/local/bin/tesseract":
> CreateProcess error=2, Das System kann die angegebene Datei nicht finden
>
> …so we may need to find out what code 1 means in your case. Could you
> run the query with debugging enabled and pass us on the stack trace?
>
> And your error code indicates that you are using an older version of
> BaseX. Does it work with a more recent version? If not, what do you
> get?
>
> Best,
> Christian
>
>
>
>
>
> Similarly:
>
> proc:system("tesseract") returns [proc:error] Cannot run program
> "tesseract": error=2, No such file or directory
>
> Similarly:
>
> proc:system("tesseract", (), map {"dir" : "/usr/local/bin/"}) returns
> [proc:error] Cannot run program "tesseract" (in directory
> "/usr/local/bin"): error=2, No such file or directory
>
> The command "tesseract" works at the command line. I suspect there may be
> a problem with permissions: is there a way to overcome this error? Thanks.
>
> Best,
> Giuseppe
>
>
>
>
>
>


Re: [basex-talk] proc:system

2019-11-13 Thread Giuseppe G. A. Celano
I would expect just some text, as with proc:system("ls")


  Usage:
  /usr/local/bin/tesseract --help | --help-extra | --version
  /usr/local/bin/tesseract --list-langs
  /usr/local/bin/tesseract imagename outputbase [options...] [configfile...]

OCR options:
  -l LANG[+LANG]  Specify language(s) used for OCR.
NOTE: These options must occur before any configfile.

Single options:
  --help        Show this help message.
  --help-extra  Show extra help for advanced users.
  --version     Show version information.
  --list-langs  List available languages for tesseract engine.

  1


I guess that the "1" code blocks the printing. If I use 
proc:system("/usr/local/bin/tesseract", "--help"), it works.

E-mail: cel...@informatik.uni-leipzig.de 

Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano 
 
Web site 2: https://sites.google.com/site/giuseppegacelano/ 


> On Nov 13, 2019, at 6:48 PM, Christian Grün  wrote:
> 
>> Interestingly, proc:execute("/usr/local/bin/tesseract") works (I have BaseX 
>> 9.2).
> 
> What does the output look like?
> 
>> proc:system("/usr/local/bin/tesseract") returns the following:
> 
> If code 1 is raised, it indicates that your command is indeed
> executed, but returns the exit code 1. Which output would
> you expect?
> 
> 
> 
> 
> 
>>> SET DEBUG true
>> DEBUG: true
>>> XQUERY proc:system("/usr/local/bin/tesseract")
>> org.basex.query.QueryException:
>> at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
>> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
>> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
>> at org.basex.query.scope.MainModule.iter(MainModule.java:97)
>> at org.basex.query.QueryContext.iter(QueryContext.java:332)
>> at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
>> at org.basex.core.cmd.AQuery.query(AQuery.java:107)
>> at org.basex.core.cmd.XQuery.run(XQuery.java:22)
>> at org.basex.core.Command.run(Command.java:257)
>> at org.basex.core.Command.execute(Command.java:93)
>> at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
>> at org.basex.api.client.Session.execute(Session.java:36)
>> at org.basex.core.CLI.execute(CLI.java:92)
>> at org.basex.core.CLI.execute(CLI.java:76)
>> at org.basex.BaseX.console(BaseX.java:176)
>> at org.basex.BaseX.<init>(BaseX.java:151)
>> at org.basex.BaseX.main(BaseX.java:42)
>> org.basex.core.BaseXException: Stopped at ., 1/12:
>> [proc:code0001]
>> at org.basex.core.Command.execute(Command.java:94)
>> at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
>> at org.basex.api.client.Session.execute(Session.java:36)
>> at org.basex.core.CLI.execute(CLI.java:92)
>> at org.basex.core.CLI.execute(CLI.java:76)
>> at org.basex.BaseX.console(BaseX.java:176)
>> at org.basex.BaseX.<init>(BaseX.java:151)
>> at org.basex.BaseX.main(BaseX.java:42)
>> Caused by: org.basex.query.QueryException:
>> at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
>> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
>> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
>> at org.basex.query.scope.MainModule.iter(MainModule.java:97)
>> at org.basex.query.QueryContext.iter(QueryContext.java:332)
>> at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
>> at org.basex.core.cmd.AQuery.query(AQuery.java:107)
>> at org.basex.core.cmd.XQuery.run(XQuery.java:22)
>> at org.basex.core.Command.run(Command.java:257)
>> at org.basex.core.Command.execute(Command.java:93)
>> ... 7 more
>> Stopped at ., 1/12:
>> [proc:code0001]
>> 
>> 
>> 
>> 
>> On Nov 13, 2019, at 5:50 PM, Christian Grün  
>> wrote:
>> 
>> Hi Giuseppe,
>> 
>> When I try to run
>> proc:system("/usr/local/bin/tesseract") I get the error [proc:code0001]
>> 
>> 
>> On my system, I get the (expected) error…
>> 
>> [proc:error] Cannot run program "/usr/local/bin/tesseract":
>> CreateProcess error=2, Das System kann die angegebene Datei nicht finden
>> 
>> …so we may need to find out what code 1 means in your case. Could you
>> run the query with debugging enabled and pass us on the stack trace?
>> 
>> And your error code indicates that you are using an older version of
>> BaseX. Does it work with a more recent version? If not, what do you
>> get?
>> 
>> Best,
>> Christian
>> 
>> 
>> 
>> 
>> 
>> Similarly:
>> 
>> proc:system("tesseract") returns [proc:error] Cannot run program 
>> "tesseract": error=2, No such file or directory
>> 
>> Similarly:
>> 
>> proc:system("tesseract", (), map {"dir" : "/usr/local/bin/"}) returns 
>> [proc:error] Cannot run program "tesseract" (in directory "/usr/local/bin"): 
>> error=2, No such file or directory
>> 
>> The command "tesseract" works at the command line. I suspect there may be a 
>> problem with permissions: is there a way to overcome this error? Thanks.
>> 

Re: [basex-talk] proc:system

2019-11-13 Thread Christian Grün
> Interestingly, proc:execute("/usr/local/bin/tesseract") works (I have BaseX 
> 9.2).

What does the output look like?

> proc:system("/usr/local/bin/tesseract") returns the following:

If code 1 is raised, it indicates that your command is indeed
executed, but returns the exit code 1. Which output would
you expect?





> > SET DEBUG true
> DEBUG: true
> > XQUERY proc:system("/usr/local/bin/tesseract")
> org.basex.query.QueryException:
> at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
> at org.basex.query.scope.MainModule.iter(MainModule.java:97)
> at org.basex.query.QueryContext.iter(QueryContext.java:332)
> at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
> at org.basex.core.cmd.AQuery.query(AQuery.java:107)
> at org.basex.core.cmd.XQuery.run(XQuery.java:22)
> at org.basex.core.Command.run(Command.java:257)
> at org.basex.core.Command.execute(Command.java:93)
> at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
> at org.basex.api.client.Session.execute(Session.java:36)
> at org.basex.core.CLI.execute(CLI.java:92)
> at org.basex.core.CLI.execute(CLI.java:76)
> at org.basex.BaseX.console(BaseX.java:176)
> at org.basex.BaseX.<init>(BaseX.java:151)
> at org.basex.BaseX.main(BaseX.java:42)
> org.basex.core.BaseXException: Stopped at ., 1/12:
> [proc:code0001]
> at org.basex.core.Command.execute(Command.java:94)
> at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
> at org.basex.api.client.Session.execute(Session.java:36)
> at org.basex.core.CLI.execute(CLI.java:92)
> at org.basex.core.CLI.execute(CLI.java:76)
> at org.basex.BaseX.console(BaseX.java:176)
> at org.basex.BaseX.<init>(BaseX.java:151)
> at org.basex.BaseX.main(BaseX.java:42)
> Caused by: org.basex.query.QueryException:
> at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
> at org.basex.query.scope.MainModule.iter(MainModule.java:97)
> at org.basex.query.QueryContext.iter(QueryContext.java:332)
> at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
> at org.basex.core.cmd.AQuery.query(AQuery.java:107)
> at org.basex.core.cmd.XQuery.run(XQuery.java:22)
> at org.basex.core.Command.run(Command.java:257)
> at org.basex.core.Command.execute(Command.java:93)
> ... 7 more
> Stopped at ., 1/12:
> [proc:code0001]
>
>
>
>
> On Nov 13, 2019, at 5:50 PM, Christian Grün  wrote:
>
> Hi Giuseppe,
>
> When I try to run
> proc:system("/usr/local/bin/tesseract") I get the error [proc:code0001]
>
>
> On my system, I get the (expected) error…
>
> [proc:error] Cannot run program "/usr/local/bin/tesseract":
> CreateProcess error=2, Das System kann die angegebene Datei nicht finden
>
> …so we may need to find out what code 1 means in your case. Could you
> run the query with debugging enabled and pass us on the stack trace?
>
> And your error code indicates that you are using an older version of
> BaseX. Does it work with a more recent version? If not, what do you
> get?
>
> Best,
> Christian
>
>
>
>
>
> Similarly:
>
> proc:system("tesseract") returns [proc:error] Cannot run program "tesseract": 
> error=2, No such file or directory
>
> Similarly:
>
> proc:system("tesseract", (), map {"dir" : "/usr/local/bin/"}) returns 
> [proc:error] Cannot run program "tesseract" (in directory "/usr/local/bin"): 
> error=2, No such file or directory
>
> The command "tesseract" works at the command line. I suspect there may be a 
> problem with permissions: is there a way to overcome this error? Thanks.
>
> Best,
> Giuseppe
>
>
>


Re: [basex-talk] I am looking for the fastest way to sort 2.4 Mio tags by two attribute ascending and descending

2019-11-13 Thread Christian Grün
Hi Omar,

> I am not 100% sure what redundant expressions you saw in my code. Is this 
> about using reverse() instead of having two for loops?

In your initial query, the path…

collection('_qdb-TEI-02__cache')//*[@order="none"]/_:d

…was evaluated four times. If you bind it to a variable, it will only
be evaluated once. In addition, using child steps instead of // is
faster, too (in many cases, BaseX will rewrite your path for you).

> I don't quite get how I would do incremental changes to the entries ordered 
> by a key. I do an incremental update by just getting the updated pre values 
> for the database that was changed. That is reasonably fast even with 
> incremental attribute index update.

Just two ideas: You can store the data sets of your main database in a
pre-sorted fashion. Incremental entries can be sorted on-the-fly in
your query, and the results can then be merged with the sorted entries
of the main database. Another approach is to store the references and
the index keys in your index database. The incremental entries can be
merged with the sorted index entries (by looking at the index keys,
which are available in both data structures).
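
The first idea (pre-sorted main data, incremental entries sorted on the fly, then merged) can be sketched in a few lines. This is an illustration in Python, not BaseX code; the function name is hypothetical and entries are modelled as (key, reference) tuples purely for the example.

```python
import heapq


def merge_sorted(main_sorted, incremental, key):
    """Merge pre-sorted main entries with unsorted incremental entries.

    Only the (usually small) incremental part is sorted at query time; the
    merge itself is a single linear pass over both sequences.
    """
    return list(heapq.merge(main_sorted, sorted(incremental, key=key), key=key))
```

For instance, merging the pre-sorted entries [("a", 1), ("c", 3)] with the new entries [("d", 4), ("b", 2)] by their first component gives the entries in the order a, b, c, d without re-sorting the main data.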

> I was not sure what is a lot of data in BaseX.

True, that’s difficult to tell in general; it always depends on the context.

> ... ! db:open-pre(./@db_name, ./@pre)

In BaseX 9.3, it will be possible to supply integer sequences as
the second argument; this may speed up your query a little.

Best,
Christian


Re: [basex-talk] proc:system

2019-11-13 Thread Giuseppe G. A. Celano
Hi Christian,

Interestingly, proc:execute("/usr/local/bin/tesseract") works (I have BaseX 
9.2).

proc:system("/usr/local/bin/tesseract") returns the following:

> SET DEBUG true
DEBUG: true
> XQUERY proc:system("/usr/local/bin/tesseract")
org.basex.query.QueryException: 
at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
at org.basex.query.scope.MainModule.iter(MainModule.java:97)
at org.basex.query.QueryContext.iter(QueryContext.java:332)
at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
at org.basex.core.cmd.AQuery.query(AQuery.java:107)
at org.basex.core.cmd.XQuery.run(XQuery.java:22)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
at org.basex.api.client.Session.execute(Session.java:36)
at org.basex.core.CLI.execute(CLI.java:92)
at org.basex.core.CLI.execute(CLI.java:76)
at org.basex.BaseX.console(BaseX.java:176)
at org.basex.BaseX.<init>(BaseX.java:151)
at org.basex.BaseX.main(BaseX.java:42)
org.basex.core.BaseXException: Stopped at ., 1/12:
[proc:code0001] 
at org.basex.core.Command.execute(Command.java:94)
at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
at org.basex.api.client.Session.execute(Session.java:36)
at org.basex.core.CLI.execute(CLI.java:92)
at org.basex.core.CLI.execute(CLI.java:76)
at org.basex.BaseX.console(BaseX.java:176)
at org.basex.BaseX.<init>(BaseX.java:151)
at org.basex.BaseX.main(BaseX.java:42)
Caused by: org.basex.query.QueryException: 
at org.basex.query.func.proc.ProcSystem.item(ProcSystem.java:26)
at org.basex.query.expr.ParseExpr.value(ParseExpr.java:50)
at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:45)
at org.basex.query.scope.MainModule.iter(MainModule.java:97)
at org.basex.query.QueryContext.iter(QueryContext.java:332)
at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
at org.basex.core.cmd.AQuery.query(AQuery.java:107)
at org.basex.core.cmd.XQuery.run(XQuery.java:22)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
... 7 more
Stopped at ., 1/12:
[proc:code0001] 




> On Nov 13, 2019, at 5:50 PM, Christian Grün  wrote:
> 
> Hi Giuseppe,
> 
>> When I try to run
>> proc:system("/usr/local/bin/tesseract") I get the error [proc:code0001]
> 
> On my system, I get the (expected) error…
> 
> [proc:error] Cannot run program "/usr/local/bin/tesseract":
> CreateProcess error=2, Das System kann die angegebene Datei nicht finden
> 
> …so we may need to find out what code 1 means in your case. Could you
> run the query with debugging enabled and pass us on the stack trace?
> 
> And your error code indicates that you are using an older version of
> BaseX. Does it work with a more recent version? If not, what do you
> get?
> 
> Best,
> Christian
> 
> 
> 
> 
>> 
>> Similarly:
>> 
>> proc:system("tesseract") returns [proc:error] Cannot run program 
>> "tesseract": error=2, No such file or directory
>> 
>> Similarly:
>> 
>> proc:system("tesseract", (), map {"dir" : "/usr/local/bin/"}) returns 
>> [proc:error] Cannot run program "tesseract" (in directory "/usr/local/bin"): 
>> error=2, No such file or directory
>> 
>> The command "tesseract" works at the command line. I suspect there may be a 
>> problem with permissions: is there a way to overcome this error? Thanks.
>> 
>> Best,
>> Giuseppe
>> 
> 



Re: [basex-talk] Docker query

2019-11-13 Thread Bogdan Bogucki

Hi Joseph,

The Docker image does not contain the webapp folder.

You have to copy it to the file system from the ZIP archive and pass its path
to the run command.


docker run -d \
  --name basexhttp \
  --publish 1984:1984 \
  --publish 8984:8984 \
  --volume "$HOME/basex/data":/srv/basex/data \
  --volume "$HOME/basex/webapp":/srv/basex/webapp \
  basex/basexhttp:latest


Please take a look:

https://hub.docker.com/r/basex/basexhttp

Regards

Bogdan Bogucki
On 13.11.2019 at 17:52, Christian Grün wrote:

Hi Joseph,

Docker is not included in the default installations of BaseX. I
haven’t tried it myself, but you could have a look at our
documentation and see what needs to be done to get the DBA application
running [1].

Best
Christian

[1] http://docs.basex.org/wiki/Docker


On Fri, Nov 8, 2019 at 10:03 AM Joseph Szili  wrote:

Hello all,  I'm trying to use the docker image for 9.x but when I surf to 
http://localhost:8984/dba/ per the documentation ...

I get the following response in the browser

No function found that matches the request.


Using this command in linux (Ubuntu 19.04)

$ docker run -d \
 --name basexhttp \
 --publish 1984:1984 \
 --publish 8984:8984 \
 --volume "$HOME/Projects/basex-dev/data":/srv/basex/data \
 basex/basexhttp:latest


container log >>
/srv/basex/.basex: writing new configuration file.
BaseX 9.3 beta [HTTP Server]
[main] INFO org.eclipse.jetty.util.log - Logging initialized @248ms to org.eclipse.jetty.util.log.Slf4jLog
[main] INFO org.eclipse.jetty.server.Server - jetty-9.4.21.v20190926; built: 2019-09-26T16:41:09.154Z; git: 72970db61a2904371e1218a95a3bef5d79788c33; jvm 1.8.0_212-b04
[main] INFO org.eclipse.jetty.util.TypeUtil - JVM Runtime does not support Modules
[main] INFO org.eclipse.jetty.webapp.StandardDescriptorProcessor - NO JSP Support for /, did not find org.eclipse.jetty.jsp.JettyJspServlet
[main] INFO org.eclipse.jetty.server.session - DefaultSessionIdManager workerName=node0
[main] INFO org.eclipse.jetty.server.session - No SessionScavenger set, using defaults
[main] INFO org.eclipse.jetty.server.session - node0 Scavenging every 60ms
Server was started (port: 1984).
java.io.FileNotFoundException: /srv/basex/data/.logs/2019-11-08.log (No such file or directory)
 at java.io.FileOutputStream.open0(Native Method)
 at java.io.FileOutputStream.open(FileOutputStream.java:270)
 at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
 at org.basex.server.LogFile.create(LogFile.java:31)
 at org.basex.server.Log.write(Log.java:128)
 at org.basex.server.Log.writeServer(Log.java:70)
 at org.basex.BaseXServer.<init>(BaseXServer.java:122)
 at org.basex.http.HTTPContext.init(HTTPContext.java:101)
 at org.basex.http.BaseXServlet.init(BaseXServlet.java:37)
 at org.eclipse.jetty.servlet.ServletHolder$WrapperServlet.init(ServletHolder.java:1287)
 at org.eclipse.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:599)
 at org.eclipse.jetty.servlet.ServletHolder.initialize(ServletHolder.java:425)
 at org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:751)
 at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:352)
 at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:483)
 at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
 at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:312)
 at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
 at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
 at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
 at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:744)
 at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:361)
 at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1443)
 at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1407)
 at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:821)
 at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:276)
 at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
 at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
 at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
 at org.eclipse.jetty.server.Server.start(Server.java:407)
 at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
 at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:106)
 at 

Re: [basex-talk] Docker query

2019-11-13 Thread Christian Grün
Hi Joseph,

Docker is not included in the default installations of BaseX. I
haven’t tried it myself, but you could have a look at our
documentation and see what needs to be done to get the DBA application
running [1].

Best
Christian

[1] http://docs.basex.org/wiki/Docker


On Fri, Nov 8, 2019 at 10:03 AM Joseph Szili  wrote:
>
> Hello all,  I'm trying to use the docker image for 9.x but when I surf to 
> http://localhost:8984/dba/ per the documentation ...
>
> I get the following response in the browser
>
> No function found that matches the request.
>
>
> Using this command in linux (Ubuntu 19.04)
>
> $ docker run -d \
> --name basexhttp \
> --publish 1984:1984 \
> --publish 8984:8984 \
> --volume "$HOME/Projects/basex-dev/data":/srv/basex/data \
> basex/basexhttp:latest
>
>
> container log >>
> /srv/basex/.basex: writing new configuration file.
> BaseX 9.3 beta [HTTP Server]
> [main] INFO org.eclipse.jetty.util.log - Logging initialized @248ms to org.eclipse.jetty.util.log.Slf4jLog
> [main] INFO org.eclipse.jetty.server.Server - jetty-9.4.21.v20190926; built: 2019-09-26T16:41:09.154Z; git: 72970db61a2904371e1218a95a3bef5d79788c33; jvm 1.8.0_212-b04
> [main] INFO org.eclipse.jetty.util.TypeUtil - JVM Runtime does not support Modules
> [main] INFO org.eclipse.jetty.webapp.StandardDescriptorProcessor - NO JSP Support for /, did not find org.eclipse.jetty.jsp.JettyJspServlet
> [main] INFO org.eclipse.jetty.server.session - DefaultSessionIdManager workerName=node0
> [main] INFO org.eclipse.jetty.server.session - No SessionScavenger set, using defaults
> [main] INFO org.eclipse.jetty.server.session - node0 Scavenging every 60ms
> Server was started (port: 1984).
> java.io.FileNotFoundException: /srv/basex/data/.logs/2019-11-08.log (No such file or directory)
> at java.io.FileOutputStream.open0(Native Method)
> at java.io.FileOutputStream.open(FileOutputStream.java:270)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
> at org.basex.server.LogFile.create(LogFile.java:31)
> at org.basex.server.Log.write(Log.java:128)
> at org.basex.server.Log.writeServer(Log.java:70)
> at org.basex.BaseXServer.<init>(BaseXServer.java:122)
> at org.basex.http.HTTPContext.init(HTTPContext.java:101)
> at org.basex.http.BaseXServlet.init(BaseXServlet.java:37)
> at org.eclipse.jetty.servlet.ServletHolder$WrapperServlet.init(ServletHolder.java:1287)
> at org.eclipse.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:599)
> at org.eclipse.jetty.servlet.ServletHolder.initialize(ServletHolder.java:425)
> at org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:751)
> at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:352)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:483)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:312)
> at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
> at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
> at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
> at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:744)
> at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:361)
> at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1443)
> at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1407)
> at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:821)
> at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:276)
> at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
> at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
> at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
> at org.eclipse.jetty.server.Server.start(Server.java:407)
> at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
> at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:106)
> at org.eclipse.jetty.server.Server.doStart(Server.java:371)
> at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
> at org.basex.BaseXHTTP.<init>(BaseXHTTP.java:120)
> at org.basex.BaseXHTTP.main(BaseXHTTP.java:52)
> [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Started 
> 

Re: [basex-talk] proc:system

2019-11-13 Thread Christian Grün
Hi Giuseppe,

> When I try to run
> proc:system("/usr/local/bin/tesseract") I get the error [proc:code0001]

On my system, I get the (expected) error…

[proc:error] Cannot run program "/usr/local/bin/tesseract":
CreateProcess error=2, Das System kann die angegebene Datei nicht finden
(in English: "The system cannot find the file specified")

…so we may need to find out what code 1 means in your case. Could you
run the query with debugging enabled and send us the stack trace?

Also, your error code indicates that you are using an older version of
BaseX. Does it work with a more recent version? If not, what do you
get?

Best,
Christian




>
> Similarly:
>
> proc:system("tesseract") returns [proc:error] Cannot run program "tesseract": 
> error=2, No such file or directory
>
> Similarly:
>
> proc:system("tesseract", (), map {"dir" : "/usr/local/bin/"}) returns 
> [proc:error] Cannot run program "tesseract" (in directory "/usr/local/bin"): 
> error=2, No such file or directory
>
> The command "tesseract" works at the command line. I suspect there may be a 
> problem with permissions: is there a way to overcome this error? Thanks.
>
> Best,
> Giuseppe
>
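
One possible explanation for the "No such file or directory" replies, offered as an assumption rather than a diagnosis: the program name is resolved against the PATH of the JVM that runs BaseX, which need not match the PATH of an interactive shell. The message format "Cannot run program …: error=2" is the one Java produces when a process cannot be started. The sketch below is plain Java, not BaseX code; the class PathLookup and the method findOnPath are hypothetical names. It prints the PATH the JVM actually sees and the first executable match for a program name.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical diagnostic helper, for illustration only (not part of BaseX).
class PathLookup {

  // Returns the first executable regular file called `program` in the given
  // PATH-style list of directories, or an empty Optional if there is none.
  static Optional<Path> findOnPath(String program, String pathValue) {
    if (pathValue == null) return Optional.empty();
    for (String dir : pathValue.split(File.pathSeparator)) {
      if (dir.isEmpty()) continue;
      Path candidate = Paths.get(dir, program);
      if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
        return Optional.of(candidate);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    String program = args.length > 0 ? args[0] : "tesseract";
    // PATH as seen by this JVM; it can differ from the interactive shell's PATH.
    String path = System.getenv("PATH");
    System.out.println("PATH = " + path);
    System.out.println(program + " -> "
        + findOnPath(program, path).map(Path::toString).orElse("not found"));
  }
}
```

Running this with the same Java installation and start script that launches BaseX should show whether /usr/local/bin is on the JVM's PATH at all.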


Re: [basex-talk] http:send-request - problem with response

2019-11-13 Thread Christian Grün
Hi Bogdan (cc to the list),

Thanks for digging deeper. I noticed that the standard Java function
that we used returned only one value per header field, and dropped the
others [1].

I managed to fix this in the latest stable snapshot; could you give it
a try [2]?

BaseX 9.3 will be released end of November.

Cheers,
Christian

[1] https://github.com/BaseXdb/basex/issues/1751
[2] http://files.basex.org/releases/latest/
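
For readers wondering how a client can end up with a single Set-Cookie value, here is a minimal Java sketch. It is not the BaseX code; it only assumes a plain HttpURLConnection, the class named elsewhere in this thread. The class HeaderValues and the helper allValues are hypothetical names. getHeaderField(name) is documented to return only the last value when a header is set several times, whereas getHeaderFields() exposes every value as a list.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only (not part of BaseX).
class HeaderValues {

  // Collects every value of a header, matching the name case-insensitively.
  // The map returned by getHeaderFields() may contain a null key (status line),
  // which equalsIgnoreCase simply does not match.
  static List<String> allValues(Map<String, List<String>> headers, String name) {
    List<String> values = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
      if (name.equalsIgnoreCase(entry.getKey())) values.addAll(entry.getValue());
    }
    return values;
  }

  public static void main(String[] args) throws IOException {
    // Pass the URL to inspect as the first argument.
    HttpURLConnection conn =
        (HttpURLConnection) URI.create(args[0]).toURL().openConnection();
    conn.setInstanceFollowRedirects(false); // look at the first response itself

    // Single-value accessor: only one Set-Cookie value is visible here.
    System.out.println("one: " + conn.getHeaderField("Set-Cookie"));
    // Multi-value accessor: all Set-Cookie values of the response.
    System.out.println("all: " + allValues(conn.getHeaderFields(), "Set-Cookie"));
    conn.disconnect();
  }
}
```

The order of the values inside the returned list is not something this sketch relies on.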




On Wed, Nov 13, 2019 at 4:26 PM Bogdan Bogucki wrote:
>
> Hi Christian,
> The missing redirects are not a big problem; I can handle them manually.
> The problem is the incomplete response header from the first request.
>
> The browser makes 3 redirects; I am talking about the first one.
>
> Please take a look.
>
>
> The response header in the browser (first request) contains three
> Set-Cookie fields:
>
> set-cookie __cfduid=daf54110b2a87d66c2a53…; domain=.pracuj.pl; HttpOnly
>
> set-cookie _yaic=13; expires=Fri, 31-Dec- 23:59:59 GMT; path=/
>
> set-cookie _urnadiam=A; domain=.pracuj.pl…3:00:00 GMT; path=/; HttpOnly
>
> but the response header from http:send-request contains only the last one:
>
> 
>
> I need the information from the other set-cookie fields to perform the
> redirects manually and pass the session information on to the next requests.
>
> Regards
>
> Bogdan Bogucki
>
> On 29.10.2019 at 03:06, Christian Grün wrote:
> > Hi Bogdan,
> >
> > The current http:send-request implementation is based on the default
> > Java HttpURLConnection, which does not resolve redirects that use
> > different protocols [1]. This is the reason why your request will not
> > be fully processed (as it e.g. happens when you use the browser).
> >
> > It seems that your initial request to the https protocol returns a 302
> > redirect to a (now unsafe) http URL, which returns another redirect to
> > https. I don’t know who maintains the discussed web site, but it could
> > be worth contacting the admins and asking them if they could update
> > and simplify their redirect policy.
> >
> > Hope this helps,
> > Christian
> >
> > [1] https://stackoverflow.com/a/1884427/1018790
> >
> >
> >
> >
> > On Fri, Oct 25, 2019 at 1:52 PM Bogdan Bogucki wrote:
> >>
> >> Hello,
> >>
> >> I have encountered a problem with the http:send-request function. I need to
> >> handle multiple (3) redirected requests with cookies.
> >>
> >> The first request returns cookie information that is required for the
> >> subsequent requests. The output of http:send-request doesn't contain all
> >> the fields returned by the server.
> >>
> >> The response in the browser is as follows:
> >> cache-control: private
> >> cf-cache-status: DYNAMIC
> >> cf-ray: 52b3cedc5b9bcc9f-WAW
> >> content-type: text/html; charset=utf-8
> >> date: Fri, 25 Oct 2019 11:21:37 GMT
> >> expect-ct: max-age=604800, report-uri="ht….com/cdn-cgi/beacon/expect-ct"
> >> location: http://www.pracuj.pl/praca/jav…eloper-warszawa,oferta,7171988
> >> server: cloudflare
> >> set-cookie: __cfduid=daf54110b2a87d66c2a53…; domain=.pracuj.pl; HttpOnly
> >> set-cookie: _yaic=13; expires=Fri, 31-Dec- 23:59:59 GMT; path=/
> >> set-cookie: _urnadiam=A; domain=.pracuj.pl…3:00:00 GMT; path=/; HttpOnly
> >> x-aspnet-version: 4.0.30319
> >> x-aspnetmvc-version: 5.2
> >> X-Firefox-Spdy: h2
> >> x-powered-by: ASP.NET
> >> x-ua-compatible: IE=edge
> >>
> >>
> >> Request:
> >>
> >> http:send-request(
> >>  
> >>
> >> >> value="text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"/>
> >>
> >>
> >>
> >>
> >>
> >>  , 
> >> 'https://www.pracuj.pl/praca/java-developer-warszawa,oferta,7171988')
> >>
> >> The response from http:send-request is:
> >>
> >> http://expath.org/ns/http-client; status="302" 
> >> message="Found">
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"/>
> >>
> >>
> >>
> >> 
> >>
> >> Two set-cookie fields are missing.
> >>
> >> How should I invoke http:send-request to receive the proper result?
> >>
> >> Regards
> >>
> >> Bogdan


Re: [basex-talk] I am looking for the fastest way to sort 2.4 Mio tags by two attribute ascending and descending

2019-11-13 Thread Omar Siam

Dear Christian,

Thank you, your suggestion is indeed 4s faster on my machine than my 
code. This is quite impressive. Now I am below 20s. Not ideal, but a good 
start. If I take care to run this only when necessary, I am willing 
to leave it at that.


I also tried some ideas from your code, but with a traditional for 
loop. It looks like that is even faster:


declare namespace _ = "https://www.oeaw.ac.at/acdh/tools/vle/util";

(: sort each key attribute once; the descending order is derived via reverse() :)
let $sorted-ascending :=
  for $key in collection('_qdb-TEI-02__cache')//*[@order="none"]/_:d/@vutlsk
  order by data($key) ascending
  return $key
let $sorted-ascending-archiv :=
  for $key in collection('_qdb-TEI-02__cache')//*[@order="none"]/_:d/@vutlsk-archiv
  order by data($key) ascending
  return $key
return (
  db:replace("_qdb-TEI-02__cache", 'ascending_cache.xml',
    <_:dryed order="ascending"
      ids="{string-join(subsequence($sorted-ascending, 1, 15000)/../(@ID, @xml:id), ' ')}"/>),
  db:replace("_qdb-TEI-02__cache", 'descending_cache.xml',
    <_:dryed order="descending"
      ids="{string-join(subsequence(reverse($sorted-ascending), 1, 15000)/../(@ID, @xml:id), ' ')}"/>),
  db:replace("_qdb-TEI-02__cache", 'ascending-archiv_cache.xml',
    <_:dryed order="ascending" label="archiv"
      ids="{string-join(subsequence($sorted-ascending-archiv, 1, 15000)/../(@ID, @xml:id), ' ')}"/>),
  db:replace("_qdb-TEI-02__cache", 'descending-archiv_cache.xml',
    <_:dryed order="descending" label="archiv"
      ids="{string-join(subsequence(reverse($sorted-ascending-archiv), 1, 15000)/../(@ID, @xml:id), ' ')}"/>)
)


This takes only 11s on my machine.

One thing I think I also saw previously: parent axis is rather slow. Do 
you agree with that or am I imagining something?


Some replies to your comments below:



> Some spontaneous ideas:
>
> • You could try to evaluate redundant expressions once and bind them
> to a variable instead (see the attached code).

I am not 100% sure which redundant expressions you saw in my code. Is
this about using reverse() instead of having two for loops?

> • You could save each document to a separate database via db:create
> (depending on your data, this may be faster than replacements in a
> single database), or save all new elements in a single document.

I tried that now, and it does not make a difference whether I do a
db:replace in _qdb-TEI-02__cache or create separate dbs for each
document with db:create. I already adjusted attrinclude so it ignores
the ids attribute.

> • Instead of creating full index structures with each update
> operation, you may save a lot of time if you only update parts of the
> data that have actually changed.

I thought about that but could not imagine how to do it. The most
probable change that affects the sort order is something like
removing a space or a ( at the start, or changing the first letter. Doing
any minimal update here would probably still mean sorting the 2.4 million
entries.

> • If that’s close to impossible (because the types of updates are too
> manifold), you could work with daily databases that only contain
> incremental changes, and merge them with the main database every night.

I don't quite get how I would do incremental changes to the entries
ordered by a key. I do an incremental update by just getting the updated
pre values for the database that was changed. That is reasonably fast
even with incremental attribute index update.

> 2,4 million tags are a lot, though; and the string length of the
> created attribute values seems to exceed 100.000 characters, which is a
> lot, too. What will you do with the resulting documents?


As I mentioned this is a custom index to a set of databases containing 
2,4 million TEI entry elements with data. These are more than 700 
databases with about 3500 entries each and updates happen to one of 
them. This is quite fast.


I was not sure what is a lot of data in BaseX. I had a feeling that my 
dataset is not medium sized anymore but I am not sure what the size of 
datasets is that should give reasonable performance. I have to say that 
searching this data in BaseX proved to be a very fast and pleasant 
experience. Just editing it entry by entry is tricky.


These really big attribute string values are one part of a two-step 
lookup I want to use to get a paging feature (at least for some of 
the 2.4 million entries).


The RESTXQ user can ask for a result with 10, 25 or 100 entries per page 
and specify a page in alphabetical order of one of the sort keys. The worst 
case is that the user does not specify any other filter criteria. If she does, 
things are fast enough in all my realistic scenarios, which involve only 
about two databases and so around 7000 index entries. I implemented getting a 
page out of all entries with sorting and subsequence. But that means it 
takes 8s or more for the first page to show. That is too long.
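
The second step of the two-step lookup described above, cutting one page out of a precomputed, already sorted ID list, can be sketched as follows. This is plain Java for illustration and not the code used in the project; the class IdPager, the method page and the 1-based page numbering are assumptions. The input is a whitespace-separated ID string, like the value of the ids attribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the paging step, for illustration only.
class IdPager {

  // Returns the IDs on the given 1-based page of a whitespace-separated,
  // already sorted ID list; a page beyond the end yields an empty list.
  static List<String> page(String ids, int pageSize, int pageNumber) {
    if (pageSize < 1 || pageNumber < 1) {
      throw new IllegalArgumentException("pageSize and pageNumber must be >= 1");
    }
    String trimmed = ids.trim();
    if (trimmed.isEmpty()) return Collections.emptyList();
    List<String> all = Arrays.asList(trimmed.split("\\s+"));
    // long arithmetic avoids overflow for large page numbers
    long from = (long) (pageNumber - 1) * pageSize;
    if (from >= all.size()) return Collections.emptyList();
    int to = (int) Math.min(all.size(), from + pageSize);
    return new ArrayList<>(all.subList((int) from, to));
  }
}
```

Because the expensive sort has already happened when the ID list was written, serving a page only costs one tokenization and one slice; the entries themselves still have to be fetched by ID afterwards.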


Using this code

declare namespace _ = "https://www.oeaw.ac.at/acdh/tools/vle/util";
let $all := collection("_qdb-TEI-02__cache")//_:dryed[@order='ascending' 
and not(@label)]/tokenize(@ids)
return 

[basex-talk] proc:system

2019-11-13 Thread Giuseppe G. A. Celano
Hi,

When I try to run 

proc:system("/usr/local/bin/tesseract") I get the error [proc:code0001] 

Similarly:

proc:system("tesseract") returns [proc:error] Cannot run program "tesseract": 
error=2, No such file or directory

Similarly:

proc:system("tesseract", (), map {"dir" : "/usr/local/bin/"}) returns 
[proc:error] Cannot run program "tesseract" (in directory "/usr/local/bin"): 
error=2, No such file or directory

The command "tesseract" works at the command line. I suspect there may be a 
problem with permissions: is there a way to overcome this error? Thanks.

Best,
Giuseppe



Re: [basex-talk] Is there an API that provides XQuery compilation results?

2019-11-13 Thread Zimmel, Daniel
With xquery:parse(), I wondered about this myself, so I second this question 
without having an answer :-(

You will get a query plan if it is a non-static error:

xquery:parse("let $a :=  return $a + 1", map 
{'compile':true(),'pass':true()})

… but what the pass option does is not clear to me. Definitely not passing the 
type error here.

If you only need to catch the error message without the query plan, there’s 
always try/catch of course.

Best, Daniel

From: Peter Villadsen 
Sent: Wednesday, 13 November 2019 00:36
To: basex-talk@mailman.uni-konstanz.de
Subject: [basex-talk] Is there an API that provides XQuery compilation results?

When I use the GUI application I can see some error description (even if it is a 
little terse) when my queries are incorrect. For instance I might get:

Stopped at tableFields.xq, 27/50:
[XQST0118] Different start and end tag: 

If I submit a query like 

However, I cannot seem to find an API that gives me that information. I tried:

xquery:parse("", map {'pass':true()})

but that did not get the result I expected. Also, while we’re at it, it 
would be nice to also get the optimized query.


Best Regards

Peter Villadsen
Principal Architect
Microsoft Business Applications Group