Re: [basex-talk] xquery result limit

2016-11-11 Thread Christian Grün
As I overlooked, in your last example you did not use 'order by'.
Without sorting, only the requested number of results will be created,
no matter if you use the GUI or work on the command line. The most
prominent example of this is the following query (which would otherwise
be extremely slow and memory-consuming):

  (1 to 1000)[1]

If you use 'order by', it’s always recommendable to return only the
minimum set of required information, and to create the full result in a
subsequent step:

  for $result in (
    for $y in ...lots of stuff...
    order by ...
    return $y
  )[position() = 1 to 100]
  return <result>{ $result }</result>

One more trick: You can move your future result in a function and
evaluate it afterwards:

  for $func in (
    for $i in 1 to 10
    order by $i descending
    return function() { <result>{ $i }</result> }
  )[position() = 1 to 5]
  return $func()
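
(A note on this sketch, as reconstructed here: the query yields the five
largest values, 10 down to 6, but only five element nodes are ever
constructed, because the constructor is deferred inside a function that is
invoked only for the items that survive the positional filter.)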

Hope this helps,
Christian
__

On Fri, Nov 11, 2016 at 6:01 PM, George Sofianos  wrote:
>
>> No problem! I am just asking because large results in the query will
>> first be cached before they are displayed in the GUI. On command-line,
>> single items will be iteratively output as soon as possible. As a
>> consequence, outputting your 900,000 rows shouldn’t cause additional
>> overhead on command-line, but it will increase memory consumption in
>> the GUI.
>>
>> Hope this helps,
>> C.
>
> Thanks, that explains the memory consumption and the delay (about 8 seconds)
> while outputting to the GUI window.
> So if I get it right, when I use [position() = 1 to 100], only the first 100
> results are calculated? Or are all 900,000 rows calculated, and I get the
> first 100 results? (Imagine it is a complex query.)
>
> (for $x in $xml//something-complex[complex-xpath]
> let $y := another-complex-function()
> where (another-complex-comparison)
> return
>   <element>{$y}</element>
> )[position() = 1 to 100]


Re: [basex-talk] xquery result limit

2016-11-11 Thread Liam R. E. Quin
On Fri, 2016-11-11 at 19:01 +0200, George Sofianos wrote:
> So if I get it right, when I use [position() = 1 to 100], only the
> first 100 results are calculated? Or are all 900,000 rows calculated,
> and I get the first 100 results? (Imagine it is a complex query.)

Note that an order by clause would force everything to be created &
sorted in any case.
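
For illustration, a small sketch (made up, not from the original mails): in
the first query below the positional filter can stop evaluation after three
items, whereas in the second the order by clause forces all items to be
computed and sorted before the filter applies:

  (for $i in 1 to 1000000 return $i * $i)[position() = 1 to 3]

  (for $i in 1 to 1000000
   order by $i descending
   return $i * $i)[position() = 1 to 3]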

Liam



Re: [basex-talk] xquery result limit

2016-11-11 Thread Christian Grün
> So if I get it right, when I use [position() = 1 to 100], only the first 100
> results are calculated? Or are all 900,000 rows calculated, and I get the
> first 100 results? (Imagine it is a complex query.)

All <element> elements will be created, but only the first 100 will
need to be cached in the GUI.

I agree there might be some chance for further optimizations here.
Volunteers are welcome!


Re: [basex-talk] xquery result limit

2016-11-11 Thread George Sofianos



> No problem! I am just asking because large results in the query will
> first be cached before they are displayed in the GUI. On command-line,
> single items will be iteratively output as soon as possible. As a
> consequence, outputting your 900,000 rows shouldn’t cause additional
> overhead on command-line, but it will increase memory consumption in
> the GUI.
>
> Hope this helps,
> C.
Thanks, that explains the memory consumption and the delay (about 8
seconds) while outputting to the GUI window.
So if I get it right, when I use [position() = 1 to 100], only the first
100 results are calculated? Or are all 900,000 rows calculated, and I
get the first 100 results? (Imagine it is a complex query.)


(for $x in $xml//something-complex[complex-xpath]
let $y := another-complex-function()
where (another-complex-comparison)
return
  <element>{$y}</element>
)[position() = 1 to 100]


Re: [basex-talk] xquery result limit

2016-11-11 Thread Christian Grün
> I'm testing the scripts in the GUI, I don't really use the command line. I
> also run them on a basexhttp instance.
> I will check it out; however, I like to keep the scripts as close to the
> XQuery spec as possible.

No problem! I am just asking because large results in the query will
first be cached before they are displayed in the GUI. On command-line,
single items will be iteratively output as soon as possible. As a
consequence, outputting your 900,000 rows shouldn’t cause additional
overhead on command-line, but it will increase memory consumption in
the GUI.

Hope this helps,
C.


Re: [basex-talk] Exporting large table from RDBMS

2016-11-11 Thread Christian Grün
Thanks! This looks like a blocker:

> <csv>{
>   $sql-results !
>     <record>
>       <id>{./sql:column[lower-case(@name)="id"]/string()}</id>
>       <name>{./sql:column[lower-case(@name)="name"]/string()}</name>
>       <timestamp>{./sql:column[lower-case(@name)="timestamp"]/string()}</timestamp>
>       <value>{./sql:column[lower-case(@name)="value"]/string()}</value>
>     </record>
> }</csv>
> };

As you are wrapping a root element around all results, everything
needs to be cached in main memory, which prevents streaming.
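
A possible workaround (a sketch under my own assumptions; the function name
and the column list are made up): build one string per row instead of a
single document with a root element, so items can be handed to the
serializer iteratively:

  declare function ioi:to-csv-lines($sql-results as node()*) {
    for $row in $sql-results
    return string-join(
      (: collect the columns of this row in a fixed order :)
      (for $name in ("id", "name", "timestamp", "value")
       return $row/sql:column[lower-case(@name) = $name]/string()),
      ","
    )
  };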


Re: [basex-talk] xquery result limit

2016-11-11 Thread George Sofianos



> Do you run the query in the GUI or on command-line?
>
> For even better performance, I recommend you have a look at the
> following HOF function:
>
> http://docs.basex.org/wiki/Hof_Module#hof:top-k-by
I'm testing the scripts in the GUI, I don't really use the command line. I
also run them on a basexhttp instance.
I will check it out; however, I like to keep the scripts as close to the
XQuery spec as possible.


Re: [basex-talk] xquery result limit

2016-11-11 Thread Christian Grün
> Hi, thanks for the quick reply. I'm more interested in the memory consumption
> and execution speed. I have a result of about 900,000 rows, and it looks like
> limiting the result like that also helps on the performance side. But I'm
> not sure about that. Is the whole result saved in memory and then garbage
> collected, or is the process much smarter?

Do you run the query in the GUI or on command-line?

For even better performance, I recommend you have a look at the
following HOF function:

  http://docs.basex.org/wiki/Hof_Module#hof:top-k-by
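
For illustration, a hedged sketch of what the function does (example values
are made up): it returns the k items with the largest sort keys, without
having to sort the complete input sequence:

  (: yields the five largest values: 1000 999 998 997 996 :)
  hof:top-k-by(1 to 1000, function($x) { $x }, 5)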


Re: [basex-talk] Exporting large table from RDBMS

2016-11-11 Thread Marco Lettere

Hi Christian,
as usual, thanks for the attention.
I'm sending you the query [1], the transformation function [2] and the
stack trace [3].
The database is MySQL 5.7.13 with JDBC driver
mysql-connector-java-5.1.39-bin.jar. Everything is deployed on a single
host running Windows 7, 32 bit.

Thanks again!
Marco.

[1] The query:
declare function j:export($table as xs:string) {
  let $connection := sql:connect(
    conf:get("jdbc.url"), conf:get("jdbc.user"), conf:get("jdbc.pass"))
  let $results := sql:execute($connection, "SELECT * FROM " || $table)
  let $close := sql:close($connection)
  return $results
};

[2] The transformation function
declare function ioi:to-cvs($sql-results as node()+) {
  <csv>{
    $sql-results !
      <record>
        <id>{./sql:column[lower-case(@name)="id"]/string()}</id>
        <name>{./sql:column[lower-case(@name)="name"]/string()}</name>
        <timestamp>{./sql:column[lower-case(@name)="timestamp"]/string()}</timestamp>
        <value>{./sql:column[lower-case(@name)="value"]/string()}</value>
      </record>
  }</csv>
};

[3] The exception
java.lang.OutOfMemoryError: Java heap space
at java.lang.Long.toString(Unknown Source)
at java.lang.Long.toString(Unknown Source)
at java.math.BigInteger.smallToString(Unknown Source)
at java.math.BigInteger.toString(Unknown Source)
at java.math.BigInteger.toString(Unknown Source)
at org.basex.query.func.sql.SqlExecute$1.next(SqlExecute.java:94)
at org.basex.query.iter.Iter.value(Iter.java:58)
at org.basex.query.expr.ParseExpr.value(ParseExpr.java:68)
at org.basex.query.QueryContext.value(QueryContext.java:410)
at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:183)
at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:95)
at org.basex.query.expr.IterMap$1.next(IterMap.java:48)
at org.basex.query.iter.Iter.value(Iter.java:51)
at org.basex.query.expr.ParseExpr.value(ParseExpr.java:68)
at org.basex.query.QueryContext.value(QueryContext.java:410)
at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:183)
at org.basex.query.expr.gflwor.GFLWOR.value(GFLWOR.java:71)
at org.basex.query.QueryContext.value(QueryContext.java:410)
at org.basex.query.func.StaticFunc.invValue(StaticFunc.java:178)
at org.basex.query.func.FuncCall.invoke(FuncCall.java:87)
at org.basex.query.func.FuncCall.value(FuncCall.java:61)
at org.basex.query.QueryContext.value(QueryContext.java:410)
at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:183)
at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:95)
at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:92)
at org.basex.query.MainModule$1.next(MainModule.java:121)
at org.basex.http.restxq.RestXqResponse.create(RestXqResponse.java:55)
at org.basex.http.restxq.RestXqModule.process(RestXqModule.java:105)
at org.basex.http.restxq.RestXqFunction.process(RestXqFunction.java:107)
at org.basex.http.restxq.RestXqServlet.run(RestXqServlet.java:44)
at org.basex.http.BaseXServlet.service(BaseXServlet.java:65)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)


On 11/11/2016 14:27, Christian Grün wrote:

…more input would be welcome ;)


Marco, I've faced the same problem and frustration, and I agree that it
would be great if BaseX could be used for such tasks, strengthening its
potential role as a data integration engine.
Kind regards,
Hans


Marco Lettere wrote on Friday, 11 November 2016 at 10:57:


Hi all,

I'm currently facing the issue of exporting a large table stored in a
MySQL instance to a CSV file that then has to be zipped and served
through a web service.

I'd love to use XQuery and the BaseX functionality for doing it because
it feels perfectly natural.

The problem is that the XQuery script that performs the SQL query and
transformation to CSV crashes with an out-of-memory exception.

I know how to increase memory and so on, but I'd like to find a stable
solution that depends as little as possible on the system's resources.

Currently the only idea is to page the operation and (maybe exploiting
parallel execution) build an intermediate result in disk files that will
be assembled before delivery.

Isn't there a more elegant solution, based somehow on the streaming
functionality or anything like this?

Thanks for any advice you can provide.

Regards,

Marco.







Re: [basex-talk] xquery result limit

2016-11-11 Thread Christian Grün
> Hi, I've found this very good answer about limiting results in XQuery:
> http://stackoverflow.com/a/8900472/1951487
> I like that it works, but I was wondering if you can explain what happens in
> the background?

Hm… What do you want to know exactly? ;) What does our query
processor do? How does iterative evaluation differ from complete
evaluation? What does the query mean?


[basex-talk] xquery result limit

2016-11-11 Thread George Sofianos
Hi, I've found this very good answer about limiting results in XQuery:
http://stackoverflow.com/a/8900472/1951487
I like that it works, but I was wondering if you can explain what
happens in the background?


Thanks,

George



Re: [basex-talk] Exporting large table from RDBMS

2016-11-11 Thread Christian Grün
…more input would be welcome ;)

> Marco, I've faced the same problem and frustration, and I agree that it
> would be great if BaseX could be used for such tasks, strengthening its
> potential role as a data integration engine.
> Kind regards,
> Hans
>
>
> Marco Lettere wrote on Friday, 11 November 2016 at 10:57:
>
>
> Hi all,
>
> I'm currently facing the issue of exporting a large table stored in a
> MySQL instance to a CSV file that then has to be zipped and served
> through a web service.
>
> I'd love to use XQuery and the BaseX functionality for doing it because
> it feels perfectly natural.
>
> The problem is that the XQuery script that performs the SQL query and
> transformation to CSV crashes with an out-of-memory exception.
>
> I know how to increase memory and so on, but I'd like to find a stable
> solution that depends as little as possible on the system's resources.
>
> Currently the only idea is to page the operation and (maybe exploiting
> parallel execution) build an intermediate result in disk files that will
> be assembled before delivery.
>
> Isn't there a more elegant solution, based somehow on the streaming
> functionality or anything like this?
>
> Thanks for any advice you can provide.
>
> Regards,
>
> Marco.
>
>
>


Re: [basex-talk] Exporting large table from RDBMS

2016-11-11 Thread Hans-Juergen Rennau
Marco, I've faced the same problem and frustration, and I agree that it would
be great if BaseX could be used for such tasks, strengthening its potential
role as a data integration engine.
Kind regards,
Hans

Marco Lettere wrote on Friday, 11 November 2016 at 10:57:

Hi all,

I'm currently facing the issue of exporting a large table stored in a
MySQL instance to a CSV file that then has to be zipped and served
through a web service.

I'd love to use XQuery and the BaseX functionality for doing it because
it feels perfectly natural.

The problem is that the XQuery script that performs the SQL query and
transformation to CSV crashes with an out-of-memory exception.

I know how to increase memory and so on, but I'd like to find a stable
solution that depends as little as possible on the system's resources.

Currently the only idea is to page the operation and (maybe exploiting
parallel execution) build an intermediate result in disk files that will
be assembled before delivery.

Isn't there a more elegant solution, based somehow on the streaming
functionality or anything like this?

Thanks for any advice you can provide.

Regards,

Marco.

Re: [basex-talk] Exporting large table from RDBMS

2016-11-11 Thread Christian Grün
Hi Marco,

Generally, the BaseX SQL Module is capable of streaming results. It is
usually up to the corresponding SQL Java driver how large result sets
are handled. Maybe you can provide us with the Java stack trace; it
could help us find the culprit.
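
As an aside (an assumption of mine, not from this thread): MySQL
Connector/J buffers complete result sets by default, so row-by-row
streaming on the driver side has to be requested explicitly, e.g. by
enabling cursor fetching in the JDBC URL. A hypothetical sketch with
placeholder credentials:

  (: the connection parameters ask the driver to fetch rows in batches
     instead of materializing the entire result set in memory :)
  let $connection := sql:connect(
    "jdbc:mysql://localhost:3306/mydb?useCursorFetch=true&amp;defaultFetchSize=1000",
    "user", "pass")
  let $results := sql:execute($connection, "SELECT * FROM mytable")
  return $results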

Cheers,
Christian


On Fri, Nov 11, 2016 at 10:56 AM, Marco Lettere  wrote:
> Hi all,
>
> I'm currently facing the issue of exporting a large table stored in a MySQL
> instance to a CSV file that then has to be zipped and served through a web
> service.
>
> I'd love to use XQuery and the BaseX functionality for doing it because it
> feels perfectly natural.
>
> The problem is that the XQuery script that performs the SQL query and
> transformation to CSV crashes with an out-of-memory exception.
>
> I know how to increase memory and so on, but I'd like to find a stable
> solution that depends as little as possible on the system's resources.
>
> Currently the only idea is to page the operation and (maybe exploiting
> parallel execution) build an intermediate result in disk files that will be
> assembled before delivery.
>
> Isn't there a more elegant solution, based somehow on the streaming
> functionality or anything like this?
>
> Thanks for any advice you can provide.
>
> Regards,
>
> Marco.
>


Re: [basex-talk] expath-pkg.xml on the server-side not recognized

2016-11-11 Thread Christian Grün
Hm. Some more information would help ;) Could you prepare an MCVE?


On Fri, Nov 11, 2016 at 2:15 PM, Rob Stapper  wrote:
> Hi,
>
>
>
> It’s been a while. Doing other stuff. But in my spare time I’m still using
> BaseX.
>
>
>
> I’m fiddling around with client-server concepts over HTTP and I find that on
> the server-side, the expath-pkg.xml is ignored: modules in the repo can’t be
> addressed by means of expath-pkg.
>
> I wonder if this is by design or is it something that should work.
>
>
>
> TIA for your reply.
>
>
>
> Rob Stapper


[basex-talk] expath-pkg.xml on the server-side not recognized

2016-11-11 Thread Rob Stapper
Hi,

 

It's been a while. Doing other stuff. But in my spare time I'm still using
BaseX.

 

I'm fiddling around with client-server concepts over HTTP and I find that on
the server-side, the expath-pkg.xml is ignored: modules in the repo can't be
addressed by means of expath-pkg.

I wonder if this is by design or is it something that should work.

 

TIA for your reply.

 

Rob Stapper



[basex-talk] Exporting large table from RDBMS

2016-11-11 Thread Marco Lettere

Hi all,

I'm currently facing the issue of exporting a large table stored in a
MySQL instance to a CSV file that then has to be zipped and served
through a web service.

I'd love to use XQuery and the BaseX functionality for doing it because
it feels perfectly natural.

The problem is that the XQuery script that performs the SQL query and
transformation to CSV crashes with an out-of-memory exception.

I know how to increase memory and so on, but I'd like to find a stable
solution that depends as little as possible on the system's resources.

Currently the only idea is to page the operation and (maybe exploiting
parallel execution) build an intermediate result in disk files that will
be assembled before delivery.

Isn't there a more elegant solution, based somehow on the streaming
functionality or anything like this?


Thanks for any advice you can provide.

Regards,

Marco.