Re: limit on resultset from a select query?

2018-04-23 Thread Paul Hermans
I did some further investigation on a local system, with these results:

Union query: 870 MB in 3:09
First part: 870 MB in 3:20
Second part: 0 MB in 26 seconds.

Will repeat when I’m back at the client.


Paul


Kind Regards,
Paul Hermans
-
ProXML bvba
Linked Data services
KBO: http://data.kbodata.be/organisation/0476_068_080#id
(w) www.proxml.be
(e) p...@proxml.be
(tw) @PaulZH
(t) +32 15 23 00 76
(m) +32 473 66 03 20
Narcisweg 17
3140 Keerbergen
Belgium

ODEdu – Innovative Open Data Education and Training based on PBL and Learning 
Analytics - http://odedu-project.eu/
OpenGovIntelligence – Public Administration Modernization by exploiting Linked 
Open Statistical Data - http://www.opengovintelligence.eu/
OpenCube – Linked Open Statistical Data - http://opencube-project.eu/


Re: limit on resultset from a select query?

2018-04-20 Thread Rob Vesse
Ah ok

The comment about the UNION is interesting.  

Remember that Jena lazily evaluates and streams query results, so it is 
possible that once it exhausts one side of the union it has to do quite a lot 
of work before it starts finding results on the other side.

One way to debug this would be to run the two sides of the union as separate, 
independent queries and observe how long each takes to start returning 
results.  If the second side has a significant delay before its first result, 
that would tend to support the hypothesis above.
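
The comparison described above can be sketched in a few lines of Python. This 
is only an illustration, not something shipped with Jena: the helper names 
(measure_stream, fetch_chunks, time_query) and any endpoint URL passed in are 
assumptions. It sends a query over the standard SPARQL protocol (HTTP GET with 
a "query" parameter, asking for text/csv) and reports how long the first byte 
took to arrive, the total time, and the size of the response.

```python
import time
import urllib.parse
import urllib.request


def measure_stream(chunks, clock=time.monotonic):
    """Consume an iterable of byte chunks.

    Returns (seconds until the first non-empty chunk, total seconds,
    total bytes). The first value is None if no data arrived at all.
    """
    start = clock()
    first = None
    total_bytes = 0
    for chunk in chunks:
        if chunk and first is None:
            first = clock() - start
        total_bytes += len(chunk)
    return first, clock() - start, total_bytes


def fetch_chunks(request, size=64 * 1024):
    """Open the request lazily and yield the body as it arrives.

    The connection is only opened when iteration starts, so the time
    measured by measure_stream includes the wait for the response.
    """
    with urllib.request.urlopen(request) as response:
        # Read a single byte first so we return as soon as any data exists.
        first = response.read(1)
        if first:
            yield first
        while True:
            chunk = response.read(size)
            if not chunk:
                return
            yield chunk


def time_query(endpoint, sparql):
    """Run one SELECT query against a SPARQL endpoint and time the CSV reply."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": sparql})
    request = urllib.request.Request(url, headers={"Accept": "text/csv"})
    return measure_stream(fetch_chunks(request))
```

Calling time_query once with each side of the union (against whatever query 
endpoint the dataset exposes, for example a hypothetical 
http://localhost:3030/ds/sparql) gives two time-to-first-byte figures that can 
be compared directly.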

Rob



Re: limit on resultset from a select query?

2018-04-20 Thread Paul Hermans
Hi,

Some more investigation done.

The good news is that we do get to the end, although one needs to be patient.
There is a substantial pause at 160 MB, but after some time the process resumes.

We do not see anything unusual in the logs.

The query itself uses a UNION.
We will try to find out whether this has an influence.


Paul






Re: limit on resultset from a select query?

2018-04-20 Thread Paul Hermans
Hi Rob,

Thanks for reaching out.
Our current hypothesis is that it is network related. We will check this 
afternoon at the client’s premises.

If not, I’ll go after the dump.


Many thanks.


Paul





Re: limit on resultset from a select query?

2018-04-20 Thread Paul Hermans
Hi Rob,

Thanks for reaching out.
Our current hypothesis is that this is network related. We are first checking 
this.

If not, I’ll go after the dump.


Paul


Re: limit on resultset from a select query?

2018-04-20 Thread Andy Seaborne
There is no specific limitation in the Fuseki code, but any intermediate 
proxy can truncate output.


Is there anything in the Fuseki log file?
Is Fuseki running as a WAR file or standalone server?

Andy



Re: limit on resultset from a select query?

2018-04-20 Thread Rob Vesse
Paul

Can you get a Java thread dump for the Fuseki process to see what it is doing 
when it is hung?
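
One common way to get such a dump is the JDK's jstack tool. The sketch below 
is an illustration only: it assumes jstack is on the PATH and is run as the 
same user as the Fuseki process, and the helper names (jstack_command, 
take_dumps) are made up for this example. It takes a few dumps some seconds 
apart, since comparing them shows whether a thread is stuck in one place or 
still making progress.

```python
import subprocess
import time


def jstack_command(pid):
    """Command line for a thread dump, including lock details (-l)."""
    return ["jstack", "-l", str(pid)]


def take_dumps(pid, count=3, interval=5.0, run=subprocess.run, sleep=time.sleep):
    """Collect `count` thread dumps of process `pid`, `interval` seconds apart.

    `run` and `sleep` are parameters only so they can be replaced in tests.
    """
    dumps = []
    for i in range(count):
        result = run(jstack_command(pid), capture_output=True, text=True, check=True)
        dumps.append(result.stdout)
        if i < count - 1:
            sleep(interval)
    return dumps
```

The process id of the server can be found with the usual operating system 
tools; take_dumps then returns the dumps as a list of strings that can be 
written to files or posted to the list.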

Rob


Re: limit on resultset from a select query?

2018-04-20 Thread Paul Hermans
It is 160MB downloaded CSV.

Paul

On 20 Apr 2018, at 11:31, Laura Morales <laure...@mail.com> wrote:

Is that 160MB of downloaded CSV file, or 160 Million triples?




limit on resultset from a select query?

2018-04-20 Thread Paul Hermans
Dear,


Fuseki version: 3.4.0.

Context: we run a SELECT query to generate a tidy CSV file for each class in a 
database.

The response always seems to stop or hang at 160 MB.



  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  160M    0  160M    0     0   424k      0 --:--:--  0:06:26 --:--:--     0


Looking at the output, the last line is 995825, which is truncated:
995824 https:///2007_00422997000115/aangifte/identificatie#id,http://purl.org/dc/terms/modified,2017-08-12T20:54:49.344
995825 https:///2007_00422997000115/aangifte/
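
A truncated last row like this can also be detected mechanically. The 
following is a minimal Python sketch, with a made-up helper name 
(check_csv_tail) and the assumption that no field contains an embedded line 
break: it compares the final line of the output with the header line and 
reports a missing line terminator or a wrong number of fields.

```python
import csv


def check_csv_tail(lines):
    """Return a description of the problem with the last CSV line, or None.

    `lines` is any iterable of text lines, for example a file opened with
    newline="" so that line terminators are preserved. Assumes fields do
    not contain embedded line breaks.
    """
    header = None
    last = None
    count = 0
    for line in lines:
        count += 1
        if header is None:
            header = line
        last = line
    if last is None:
        return "no output at all"
    if not last.endswith("\n"):
        return f"line {count} has no line terminator"
    expected = len(next(csv.reader([header])))
    actual = len(next(csv.reader([last])))
    if actual != expected:
        return f"line {count} has {actual} fields, expected {expected}"
    return None
```

Run against the downloaded file (opened with newline=""), this returns None 
for a complete result and a short message when the transfer stopped in the 
middle of a row.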

I looked through the documentation for a configuration setting that could be 
related to this with no success.




Kind Regards,
Paul