Re: Fuseki growing in size and need for compaction

2024-04-23 Thread Mikael Pesonen

We have the same issue and have stopped using Jena for that reason.

On 22/04/2024 18:22, Balduin Landolt wrote:

Hello,

we're running Fuseki 5.0.0 (the last 4.x versions behaved essentially
the same) with roughly 40 million triples, and the number is growing.
Not sure what configuration is relevant, but we have the default graph as
the union graph.
Also, we use Fuseki as our main database, not just as a "view on our data"
so we do quite a bit of updating on the data all the time.

Lately, we've been having more and more issues with servers running out of
disk space because Fuseki's database grew pretty rapidly.
This can be solved by compacting the DB, but with our data and hardware
this takes ca. 15 minutes, during which Fuseki does not accept any update
queries, so for the production system we can't really do this outside of
nighttime hours when (hopefully) no one uses the system anyways.

Some things we've noticed:
- A subset of our data (I think ~20 million triples) takes up 6GB in
compacted state and ca. 5GB when dumped to a .trig file. But when the same
.trig file is uploaded to an empty DB, the DB grows to ca. 25GB.
- Dropping graphs does not free up disk space.
- A sequence of e.g. 10k update queries, each changing only a small number
of triples (maybe 1-10 or so) on the full dataset, seems to grow the DB size
a lot, by 10s to 100s of GB (I don't have numbers on this one, but it was
substantial).
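
To put numbers on this kind of growth, one option is to record the on-disk size of each storage generation before and after a batch of updates. The following is only a rough sketch: it assumes the database folder contains generation directories named like Data-0001 and Data-0002 (as in the compaction reports elsewhere in this archive), and the function names are made up for illustration. It sums apparent file sizes, which can differ from what `du` or `df` report.

```python
import os
import re

# Assumption: generation directories are named Data-0001, Data-0002, ...
DATA_DIR = re.compile(r"^Data-\d+$")


def dir_size(path):
    """Sum the apparent size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def generation_sizes(db_dir):
    """Map each Data-NNNN directory name in db_dir to its size in bytes."""
    sizes = {}
    for entry in sorted(os.listdir(db_dir)):
        full = os.path.join(db_dir, entry)
        if DATA_DIR.match(entry) and os.path.isdir(full):
            sizes[entry] = dir_size(full)
    return sizes
```

Calling `generation_sizes()` on the database folder before and after the 10k updates would give a concrete growth figure to report.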

My question is:
Would that kind of growth in disk usage be expected? Are other people
having similar issues? Are there strategies to mitigate this? Maybe some
configuration that may be tweaked or so?

Best & thanks in advance,
Balduin


--
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's 
Tools - Text Tools - E-books and M-books

Mikael Pesonen
Semantic Technologies

e-mail: mikael.peso...@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND



Re: Text indexing stopped working

2023-11-30 Thread Mikael Pesonen
Unfortunately that's all there is. There was nothing in the log at the 
time of last index update.


On 30/11/2023 13.56, Andy Seaborne wrote:

There isn't much information to go on.

On 29/11/2023 09:50, Mikael Pesonen wrote:

No idea?

On 16/11/2023 13.11, Mikael Pesonen wrote:
What could be the reason why new data is suddenly not added to text 
index and not found with Jena text queries?


The newest files in Jena text index folder are zero sized

_b_Lucene85FieldsIndexfile_pointers_n.tmp
_b_Lucene85FieldsIndex-doc_ids_m.tmp

dated 2023-11-13 although I have added lots of data since then using 
same methods as before. Text queries find all the data before this 
date.
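
For anyone wanting to check an index folder in the same way, here is a small sketch that lists the most recently modified files with their sizes and picks out the zero-sized ones. It only inspects the directory and assumes nothing about Lucene internals. The function names are illustrative and not part of Jena.

```python
import os
from datetime import datetime, timezone


def newest_files(index_dir, limit=10):
    """Return (name, size_bytes, mtime_utc_iso) tuples, newest first."""
    rows = []
    with os.scandir(index_dir) as entries:
        for entry in entries:
            if entry.is_file():
                st = entry.stat()
                rows.append((st.st_mtime, entry.name, st.st_size))
    rows.sort(reverse=True)
    result = []
    for mtime, name, size in rows[:limit]:
        stamp = datetime.fromtimestamp(mtime, timezone.utc)
        result.append((name, size, stamp.isoformat(timespec="seconds")))
    return result


def zero_sized_files(index_dir):
    """Return the sorted names of all empty files in index_dir."""
    with os.scandir(index_dir) as entries:
        return sorted(e.name for e in entries
                      if e.is_file() and e.stat().st_size == 0)
```

If the newest entries are all empty .tmp files and nothing else has a recent modification time, that would be consistent with index writes having stopped around that date.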


BR







Re: Text indexing stopped working

2023-11-29 Thread Mikael Pesonen

No idea?

On 16/11/2023 13.11, Mikael Pesonen wrote:
What could be the reason why new data is suddenly not added to text 
index and not found with Jena text queries?


The newest files in Jena text index folder are zero sized

_b_Lucene85FieldsIndexfile_pointers_n.tmp
_b_Lucene85FieldsIndex-doc_ids_m.tmp

dated 2023-11-13 although I have added lots of data since then using 
same methods as before. Text queries find all the data before this date.


BR




Text indexing stopped working

2023-11-16 Thread Mikael Pesonen
What could be the reason why new data is suddenly not added to text 
index and not found with Jena text queries?


The newest files in Jena text index folder are zero sized

_b_Lucene85FieldsIndexfile_pointers_n.tmp
_b_Lucene85FieldsIndex-doc_ids_m.tmp

dated 2023-11-13 although I have added lots of data since then using 
same methods as before. Text queries find all the data before this date.


BR


Re: Jena hangs on deleted files

2023-10-05 Thread Mikael Pesonen
We have the same issue now. Here's a truncated lsof:
https://pastebin.com/rkUvEK0f


Data-0001 was 86G, and after compacting with deletion of the old data,
Data-0002 was 8G. The space used by Data-0001 is not released although the
folder no longer exists, so df shows they take about 100G. Another issue is
of course the data folder bloating to 10x. I guess that can't be prevented?




On 12/09/2023 12.12, Rob @ DNR wrote:

Well, yes, there shouldn’t be, but that wasn’t what Andy suggested/asked.

Have you verified that nothing else is holding references to those files in any 
way e.g.

lsof | grep /path/to/your/db

And checked that only a single Java process is listed in the output?
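
The check described above can be scripted. The sketch below takes the text output of `lsof` and groups deleted-but-still-open files under the database path by process. It assumes the Linux lsof convention of appending "(deleted)" to the NAME column and that the command and PID are the first two columns. The function name is made up for illustration.

```python
from collections import defaultdict

DELETED_MARK = "(deleted)"


def deleted_files_by_process(lsof_output, db_path):
    """Group deleted-but-open files under db_path by (pid, command)."""
    held = defaultdict(list)
    for line in lsof_output.splitlines():
        stripped = line.rstrip()
        if db_path not in stripped or not stripped.endswith(DELETED_MARK):
            continue
        fields = stripped.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # header line or unexpected layout
        command, pid = fields[0], int(fields[1])
        # Take the file name from the first occurrence of db_path onwards
        # and drop the trailing "(deleted)" marker.
        name = stripped[stripped.index(db_path):-len(DELETED_MARK)].rstrip()
        held[(pid, command)].append(name)
    return dict(held)
```

If the result has more than one key, or a key whose command is not java, then some other process is holding the files.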

We don’t know your deployment environment. It could be some mundane background
process (e.g. anti-virus, search indexer) running on your system, it could be a
bug in the particular JVM you are using, or something else entirely, but without
any more details we can only guess at possibilities.

Another long shot is that it could be a hardware issue. If you’re running the
database on an SSD, it could be a driver optimisation that does not actually
delete files until the holding process exits, to avoid unnecessary write
operations and prolong the life of the drive.

Rob

From: Mikael Pesonen 
Date: Monday, 11 September 2023 at 12:17
To: users@jena.apache.org 
Subject: Re: Jena hangs on deleted files
There should not be other processes accessing the files. When Jena is
restarted, space from deleted files is released.

On 09/09/2023 18.56, Andy Seaborne wrote:

This situation could be related to the other issues you've reported
(corrupted node tables) if some other Linux process (not necessarily
Java) is accessing the files.

A process holding them open will stop them becoming recyclable by the OS.
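
The effect is easy to reproduce outside Jena. The sketch below (POSIX only, and purely illustrative) creates a temporary file, unlinks it while a descriptor is still open, and shows that the data remains readable through that descriptor. The blocks are only handed back to the filesystem once the descriptor is closed.

```python
import os
import tempfile


def unlink_while_open(payload):
    """Delete a file that is still open and read it back via the open fd.

    Returns (gone_from_directory, link_count, size_bytes, data_read).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                      # remove the directory entry
        gone_from_directory = not os.path.exists(path)
        info = os.fstat(fd)                  # the inode is still alive
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))
        return gone_from_directory, info.st_nlink, info.st_size, data
    finally:
        os.close(fd)                         # only now can the space be freed
```

On a typical Linux filesystem the link count comes back as 0 while the size and contents are unchanged, which is the state that makes `df` and `du` disagree.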

 Andy

On 08/09/2023 13:09, Mikael Pesonen wrote:

Just on a command line (dev system)

/usr/bin/java -Xmx8G -jar fuseki-server.jar --update --port 3030 --config=../jena_config/fuseki_config.ttl


On 08/09/2023 11.47, Andy Seaborne wrote:

In a container? As a VM?

On 08/09/2023 07:36, Mikael Pesonen wrote:

We are using Ubuntu.

On Thu, 7 Sept 2023 at 16:33, Andy Seaborne  wrote:


Are the database files on a MS Windows filesystem?

There is a long-standing Java issue that memory mapped files on MS
Windows do not get freed until the JVM exits.

Various bugs in the OpenJDK bug database such as:

https://bugs.openjdk.org/browse/JDK-4715154

   Andy

On 07/09/2023 13:06, Mikael Pesonen wrote:

We used the deleteOld parameter. The 50 gigs are ghost files that are deleted
but not released; that's what I meant by hanging on deleted files.
Restarting Jena releases them, and this time, for example, it freed 50 gigs
of space.

On 07/09/2023 15.02, Øyvind Gjesdal wrote:

What does the content of the tdb2 folder look like?

I think compact by default never deletes the old data, but you have
parameters for making it delete the old content on completion.

`--deleteOld` can be supplied to the tdb2.tdbcompact command line tool, and
`?deleteOld=true` can be supplied to the administration API when calling
compact:

https://jena.apache.org/documentation/fuseki2/fuseki-server-protocol.html#compact

You can also delete the Data- folder that isn't the latest one in the
database folder.
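
As a rough illustration of the administration API call mentioned above, the sketch below builds and sends the compaction request from Python. It assumes the `POST /$/compact/{name}` operation described on the linked protocol page. The server URL and dataset name are placeholders, authentication is not handled, and the helper names are made up.

```python
import urllib.parse
import urllib.request


def build_compact_request(server_url, dataset, delete_old=False):
    """Build a POST request for the Fuseki admin compact operation."""
    name = urllib.parse.quote(dataset.strip("/"), safe="")
    url = f"{server_url.rstrip('/')}/$/compact/{name}"
    if delete_old:
        url += "?deleteOld=true"
    # An empty body keeps this a plain POST with no payload.
    return urllib.request.Request(url, data=b"", method="POST")


def compact(server_url, dataset, delete_old=False, timeout=30):
    """Send the request and return (HTTP status, response body as text)."""
    request = build_compact_request(server_url, dataset, delete_old)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

For example, `compact("http://localhost:3030", "ds", delete_old=True)` would target a hypothetical dataset named ds on a local server.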

Best regards,
Øyvind

On Thu, Sep 7, 2023 at 1:33 PM Mikael Pesonen wrote:


After a while, 25 gigs of files in the data folder become 80 gigs of disk
usage because Jena (4.6.1) doesn't release files. Same with compact. Is
this fixed in newer versions?






Re: Post big file with curl

2023-09-28 Thread Mikael Pesonen

This was too obvious. Thanks!

On 28/09/2023 14.13, Simon Bin wrote:

combine -T with -X POST
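
For completeness, the same idea outside curl: the sketch below POSTs a large file without reading it into memory first, using Python's standard http.client, which sends a file object in blocks. Content-Length is set explicitly from the file size so the body is not chunk-encoded. The function name and the URL in the usage note are assumptions for illustration only.

```python
import http.client
import os
import urllib.parse


def post_file_streaming(url, path, content_type, timeout=None):
    """POST the file at path to url, streaming it from disk in blocks."""
    parts = urllib.parse.urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    headers = {
        "Content-Type": content_type,
        # Explicit length, so http.client does not switch to chunked encoding.
        "Content-Length": str(os.path.getsize(path)),
    }
    try:
        with open(path, "rb") as body:
            # A file object body is read and sent block by block.
            conn.request("POST", target, body=body, headers=headers)
            response = conn.getresponse()
            response.read()
            return response.status
    finally:
        conn.close()
```

A call might look like `post_file_streaming("http://localhost:3030/ds/data", "big.ttl", "text/turtle")`, where both the endpoint and the file name are hypothetical.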

On Thu, 2023-09-28 at 13:11 +0300, Mikael Pesonen wrote:

How do you do that? -T is PUT and --data-binary loads the entire file into
memory before sending, so it runs out of memory.






Post big file with curl

2023-09-28 Thread Mikael Pesonen



How do you do that? -T is PUT and --data-binary loads the entire file into
memory before sending, so it runs out of memory.


Re: Jena hangs on deleted files

2023-09-13 Thread Mikael Pesonen
We have to wait for the next occurrence (hopefully never) to get the complete
lsof. It seems odd, though, that it would matter, since killing/restarting
Jena solves the issue.









Re: Jena hangs on deleted files

2023-09-11 Thread Mikael Pesonen
There should not be other processes accessing the files. When Jena is
restarted, space from deleted files is released.




Re: Jena hangs on deleted files

2023-09-08 Thread Mikael Pesonen

Just on a command line (dev system)

/usr/bin/java -Xmx8G -jar fuseki-server.jar --update --port 3030 --config=../jena_config/fuseki_config.ttl








Re: Jena hangs on deleted files

2023-09-08 Thread Mikael Pesonen
We are using Ubuntu.



Re: Jena hangs on deleted files

2023-09-07 Thread Mikael Pesonen



We used the deleteOld parameter. The 50 gigs are ghost files that are deleted
but not released; that's what I meant by hanging on deleted files.
Restarting Jena releases them, and this time, for example, it freed 50 gigs
of space.








Jena hangs on deleted files

2023-09-07 Thread Mikael Pesonen



After a while, 25 gigs of files in the data folder become 80 gigs of disk
usage because Jena (4.6.1) doesn't release files. Same with compact. Is
this fixed in newer versions?


Re: NodeTableTRDF/Read exception

2023-06-19 Thread Mikael Pesonen
]: at org.apache.jena.sparql.lang.ParserARQUpdate.executeParse(ParserARQUpdate.java:42) ~[fuseki-server.jar:4.6.1]
Jun 19 11:35:22 xxx.fi java[1047205]: at org.apache.jena.sparql.lang.UpdateParser.parse(UpdateParser.java:53) ~[fuseki-server.jar:4.6.1]
Jun 19 11:35:22 xxx.fi java[1047205]: at org.apache.jena.update.UpdateAction.parseExecute(UpdateAction.java:423) ~[fuseki-server.jar:4.6.1]
Jun 19 11:35:22 xxx.fi java[1047205]: at org.apache.jena.update.UpdateAction.parseExecute(UpdateAction.java:381) ~[fuseki-server.jar:4.6.1]
Jun 19 11:35:22 xxx.fi java[1047205]: at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:229) ~[fuseki-server.jar:4.6.1]
Jun 19 11:35:22 xxx.fi java[1047205]: [2023-06-19 11:35:22] Fuseki INFO  [34] 500 Server Error (30 ms)




Re: NodeTableTRDF/Read exception

2023-06-19 Thread Mikael Pesonen
I deleted the Jena data folder, restarted the DB and started re-adding data.
Still, after a while I get NodeTableTRDF/Read. Any idea?


On 21/03/2022 20.43, Andy Seaborne wrote:
The only time I have seen anything similar to this is on Android, where
some other process is messing about with the files. TDB is not proof
against other processes accessing the same files, including with shared
network drives where different computers access the same filesystem.


    Andy

On 21/03/2022 11:39, Mikael Pesonen wrote:


Got this again after a few days of light usage, after TDB2 was rebuilt
from empty. Would you suggest this is a hardware error? Is there no
possibility that it's a Jena error?


On 28/05/2021 17.25, Andy Seaborne wrote:



On 28/05/2021 14:59, Mikael Pesonen wrote:


I should try some older Jena/Fuseki version?


Yes.

Also
 - run on different hardware.
 - run multiple times
 - look at the data and see if anything unusual is in it.
 etc etc




On 28/05/2021 16.49, Andy Seaborne wrote:



On 28/05/2021 14:03, Mikael Pesonen wrote:


No this is the fresh db, started from empty today. And plenty of 
disk space.


So it's repeatable.

With no Minimal, Verifiable, Complete Example, it'll have to be an 
on-site investigation. Try different versions.


    Andy



On 28/05/2021 15.58, Andy Seaborne wrote:

Why are you adding data to a broken database?

On 28/05/2021 12:02, Mikael Pesonen wrote:


Actually now it happened again. Same size, about 80MB of 
turtle, imported without warnings this time, but reading the 
graph fails with this exception.


13:59:39 WARN  Fuseki  :: [44] RC = 500 : NodeTableTRDF/Read

org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
 at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:113) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:74) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.sparql.engine.iterator.QueryIterBlockTriplesStar.hasNextBinding(QueryIterBlockTriplesStar.java:54) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
 at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki

Re: NodeTableTRDF/Read exception

2023-05-15 Thread Mikael Pesonen

Normal updates work, but this error occurs when clearing a graph.

On 12/05/2023 14.58, Mikael Pesonen wrote:
Got this NodeTableTRDF/Read again in our production environment. I got a
message from our admin that Jena had unusually high memory consumption and
that it suddenly normalized yesterday. Maybe that is related.


~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:74) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.iterator.QueryIterBlockTriplesStar.hasNextBinding(QueryIterBlockTriplesStar.java:54) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext

Re: NodeTableTRDF/Read exception

2023-05-12 Thread Mikael Pesonen
Got this NodeTableTRDF/Read again in our production environment. I got a 
message from the admin that Jena had unusually high memory consumption, 
which suddenly normalized yesterday. Maybe it is related to this.



Re: Combine two columns in SPARQL

2023-04-27 Thread Mikael Pesonen



I don't know what has happened but now it works. I had the old query in 
Fuseki GUI so I'm sure I didn't change anything.


On 26/04/2023 9.19, Lorenz Buehmann wrote:

I cannot reproduce it:


Data (test.ttl):

PREFIX : <https://example.org/foo#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

:Animals
  skos:prefLabel "animals"@en ;
  skos:altLabel "fauna"@en ;
  skos:hiddenLabel "aminals"@en ;
  skos:prefLabel "animaux"@fr ;
  skos:altLabel "faune"@fr .

Query (test.rq):


PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

select ?s ?pl_al
where {
    ?s skos:prefLabel ?pl .
    ?s skos:altLabel ?al .
    bind(concat(?pl, ?al) as ?pl_al)
  }


Usage (Jena CLI):

sparql --data test.ttl --query test.rq

Result:

---------------------------------------------------------
| s                                 | pl_al             |
=========================================================
| <https://example.org/foo#Animals> | "animauxfauna"    |
| <https://example.org/foo#Animals> | "animauxfaune"@fr |
| <https://example.org/foo#Animals> | "animalsfauna"@en |
| <https://example.org/foo#Animals> | "animalsfaune"    |
---------------------------------------------------------


Do you have named graphs or something? I mean, is just one column 
empty or the whole resultset?
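
For reference, the mix of tagged and untagged values in the result above follows from how SPARQL 1.1 defines CONCAT: the result keeps a language tag only when every argument carries the same tag, and otherwise it is a plain literal. A minimal, self-contained sketch (the VALUES rows are illustrative and not taken from anyone's real data):

```sparql
SELECT ?pl ?al ?pl_al
WHERE {
  # Inline test rows (hypothetical): one pair with the same tag, one with mixed tags.
  VALUES (?pl ?al) {
    ("animals"@en "fauna"@en)
    ("animaux"@fr "fauna"@en)
  }
  # Same tag on both arguments -> tagged result; mixed tags -> plain literal.
  BIND(CONCAT(?pl, ?al) AS ?pl_al)
}
```

If ?pl_al comes back entirely unbound rather than merely untagged, one possible cause is an argument that is not a string literal (an IRI, say); wrapping it in STR() may help in that case.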




On 24.04.23 14:18, Mikael Pesonen wrote:


Not a Jena question, but I hope someone can help. I have two columns that 
always have an equal number of rows. How can they be combined into one column 
(variable)? This method doesn't work (the example has different predicates):


select ?s ?pl_al
where {
    ?s skos:prefLabel ?pl .
    ?s skos:altLabel ?al .
    bind(concat(?pl, ?al) as ?pl_al)
  }






Re: Combine two columns in SPARQL

2023-04-24 Thread Mikael Pesonen

Thanks for testing it.

So I wonder what is causing the empty values in my case.

On 24/04/2023 16.42, James Anderson wrote:

that would matter.
i make a dataset with strings:

   https://dydra.com/test/test/first_10_types.html


On 24. Apr 2023, at 15:22, Mikael Pesonen  wrote:

Okay, so it should work? I'm getting empty values on Jena 4.6.1.

Also tried

bind(concat(str(?pl), str(?al)) as ?pl_al)

just in case.


On 24/04/2023 16.16, James Anderson wrote:

good afternoon;


On 24. Apr 2023, at 14:18, Mikael Pesonen  wrote:


Not a Jena question, but I hope someone can help. I have two columns that always 
have an equal number of rows. How can they be combined into one column (variable)? 
This method doesn't work (the example has different predicates):

select ?s ?pl_al
where {
 ?s skos:prefLabel ?pl .
 ?s skos:altLabel ?al .
 bind(concat(?pl, ?al) as ?pl_al)
   }

what do you intend, which this does not yield:

 https://dydra.com/test/test/columns.html


---
james anderson | ja...@dydra.com | https://dydra.com



--
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's 
Tools - Text Tools - E-books and M-books

Mikael Pesonen
Semantic Technologies

e-mail: mikael.peso...@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND





Re: Combine two columns in SPARQL

2023-04-24 Thread Mikael Pesonen

Okay, so it should work? I'm getting empty values on Jena 4.6.1.

Also tried

bind(concat(str(?pl), str(?al)) as ?pl_al)

just in case.





Combine two columns in SPARQL

2023-04-24 Thread Mikael Pesonen



Not a Jena question, but I hope someone can help. I have two columns that 
always have an equal number of rows. How can they be combined into one column 
(variable)? This method doesn't work (the example has different predicates):


select ?s ?pl_al
where {
    ?s skos:prefLabel ?pl .
    ?s skos:altLabel ?al .
    bind(concat(?pl, ?al) as ?pl_al)
  }


Re: Server error with large truncated log

2023-04-24 Thread Mikael Pesonen



Do you have any idea what could mess up the log, or how to fix it?

On 22/04/2023 15.57, Andy Seaborne wrote:



On 20/04/2023 13:55, Mikael Pesonen wrote:
Removing the fi section results in the same error, as does removing 
OPTIONAL from en as well. The query with all three languages works without 
OPTIONALs, but that doesn't capture graphs (?ls_id) lacking content in some 
language (it returns only graphs that have all three languages).


https://gist.github.com/mikael1234/a0ed6b4947d392b5798ca29cdab69b1f


...
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1038) 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144)

...

TProtocolUtil.skip does not call Jetty. The Jetty line looks like the 
start of a different stack trace.


That seems to be the overlap of several log outputs with loss of details.

It does not have the beginning of any of the exceptions which is where 
the exception message is.


    Andy



On 20/04/2023 15.36, Andy Seaborne wrote:



On 20/04/2023 13:23, Mikael Pesonen wrote:


I have a query with some counts for statistics (only way I got them 
working)


  { SELECT ?ls_id (count(distinct ?pl_fi) as ?fi_count) WHERE {
      GRAPH ?ls_id {
        ?c skos:prefLabel ?pl_fi FILTER(LANG(?pl_fi) = "fi")
      }
    } GROUP BY ?ls_id }

  OPTIONAL {
    { SELECT ?ls_id (count(distinct ?pl_en) as ?en_count) WHERE {
        GRAPH ?ls_id {
          ?c skos:prefLabel ?pl_en FILTER(LANG(?pl_en) = "en")
        }
      } GROUP BY ?ls_id }
  }

  OPTIONAL {
    { SELECT ?ls_id (count(distinct ?pl_sv) as ?sv_count) WHERE {
        GRAPH ?ls_id {
          ?c skos:prefLabel ?pl_sv FILTER(LANG(?pl_sv) = "sv")
        }
      } GROUP BY ?ls_id }
  }

With en and fi languages this works, but adding sv it fails with 
Server Error and 12000+ lines of log for one exception.


Does en and sv work?

Log is truncated so beginning of the log is missing. How could I 
proceed with debugging this? Jena is 4.6.1 and I'm running queries 
in Fuseki web GUI.


Last lines of exception log:


No Jena code there.

What you are looking for is the top of the stack trace (back to the 
operation in Jena) and also the top of the "caused by".


If they are all the same, put one example on a gist.

    Andy

Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:497) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:282) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:319) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:412) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:381) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:268) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:138) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:407) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:894) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1038) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: [2023-04-20 15:08:40] 
Fuseki INFO  [6607] 500 Server Error (40.263 s)
Apr 20 15:08:40 x.lingsoft.fi java[832674]: [2023-04-20 15:08:40] 
Fuseki INFO  [6606] 500 Server Error (40.279 s)







Re: Server error with large truncated log

2023-04-21 Thread Mikael Pesonen
I forgot to mention that the query is run as a federated call. A local call 
also results in a server error, but with a bit more detail:


check(??0, null): null node value





Re: Server error with large truncated log

2023-04-20 Thread Mikael Pesonen
Removing the fi section results in the same error, as does removing OPTIONAL 
from en as well. The query with all three languages works without OPTIONALs, 
but that doesn't capture graphs (?ls_id) lacking content in some language 
(it returns only graphs that have all three languages).


https://gist.github.com/mikael1234/a0ed6b4947d392b5798ca29cdab69b1f




Server error with large truncated log

2023-04-20 Thread Mikael Pesonen



I have a query with some counts for statistics (only way I got them working)

  { SELECT ?ls_id (count(distinct ?pl_fi) as ?fi_count) WHERE {
      GRAPH ?ls_id {
        ?c skos:prefLabel ?pl_fi FILTER(LANG(?pl_fi) = "fi")
      }
    } GROUP BY ?ls_id }

  OPTIONAL {
    { SELECT ?ls_id (count(distinct ?pl_en) as ?en_count) WHERE {
        GRAPH ?ls_id {
          ?c skos:prefLabel ?pl_en FILTER(LANG(?pl_en) = "en")
        }
      } GROUP BY ?ls_id }
  }

  OPTIONAL {
    { SELECT ?ls_id (count(distinct ?pl_sv) as ?sv_count) WHERE {
        GRAPH ?ls_id {
          ?c skos:prefLabel ?pl_sv FILTER(LANG(?pl_sv) = "sv")
        }
      } GROUP BY ?ls_id }
  }

With en and fi languages this works, but adding sv it fails with Server 
Error and 12000+ lines of log for one exception. Log is truncated so 
beginning of the log is missing. How could I proceed with debugging 
this? Jena is 4.6.1 and I'm running queries in Fuseki web GUI.
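
As an aside, the three per-language sub-selects above could also be written as one grouped query. This is only a sketch, untested against this dataset, and whether it avoids the server error reported here is unknown. The variable ?none is a hypothetical name that is deliberately never bound. Unlike the original, this version returns every graph that has any skos:prefLabel, with a count of 0 for languages that are missing.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?ls_id
       (COUNT(DISTINCT ?pl_fi) AS ?fi_count)
       (COUNT(DISTINCT ?pl_en) AS ?en_count)
       (COUNT(DISTINCT ?pl_sv) AS ?sv_count)
WHERE {
  GRAPH ?ls_id {
    ?c skos:prefLabel ?pl .
  }
  # ?none is never bound (hypothetical helper name). When the language does
  # not match, IF() evaluates ?none, which is an error, so BIND leaves the
  # target variable unbound and COUNT skips that row.
  BIND(IF(LANG(?pl) = "fi", ?pl, ?none) AS ?pl_fi)
  BIND(IF(LANG(?pl) = "en", ?pl, ?none) AS ?pl_en)
  BIND(IF(LANG(?pl) = "sv", ?pl, ?none) AS ?pl_sv)
}
GROUP BY ?ls_id
```

The design choice is to scan the labels once and split them by language afterwards, instead of joining three independently grouped sub-selects on ?ls_id.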


Last lines of exception log:

Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:497) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:282) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:319) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:412) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:381) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:268) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:138) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:407) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:894) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1038) 
~[fuseki-server.jar:4.6.1]
Apr 20 15:08:40 x.lingsoft.fi java[832674]: [2023-04-20 15:08:40] 
Fuseki INFO  [6607] 500 Server Error (40.263 s)
Apr 20 15:08:40 x.lingsoft.fi java[832674]: [2023-04-20 15:08:40] 
Fuseki INFO  [6606] 500 Server Error (40.279 s)




Re: Strategies to avoid log flooding

2023-03-29 Thread Mikael Pesonen



Here the next line was a REPLACE, which is why I used a regex:

VALUES ?class_label { " \\(häiriö\\)" " \\(löydös\\)" " \\(toimenpide\\)" }
?concept rdfs:label ?fsnl FILTER (REGEX(?fsnl, $class_label)) .
BIND (REPLACE(?fsnl, ?class_label, "") AS ?newl) .

But indeed, I didn't mean to use $ in $class_label, no idea what that 
syntax means. But it was not the cause here?


So to use constants, write above like this?

?concept rdfs:label ?fsnl
FILTER (REGEX(?fsnl, " \\(häiriö\\)") || REGEX(?fsnl, " \\(löydös\\)") || 
REGEX(?fsnl, " \\(toimenpide\\)") ) .

BIND (REPLACE(?fsnl, " \\(häiriö\\)", "") AS ?newl1) .
BIND (REPLACE(?newl1, " \\(löydös\\)", "") AS ?newl2) .
BIND (REPLACE(?newl2, " \\(toimenpide\\)", "") AS ?newl) .


On 29/03/2023 15.20, Andy Seaborne wrote:



On 29/03/2023 12:56, Rob @ DNR wrote:
Yes, you can filter these out. The logger in question is the class 
name shown; the log4j configuration needs to reference it by its 
fully qualified name, i.e. 
org.apache.jena.sparql.engine.iterator.QueryIterFilterExpr, and set it 
to ERROR/OFF to suppress these warnings.
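
Rob's suggestion could look roughly like this in a Log4j2 properties file, assuming the Fuseki server reads its logging setup from such a file (commonly named log4j2.properties). The logger id "filterexpr" is an arbitrary name chosen for this sketch; the logger name is the fully qualified class name given above.

```properties
# Raise the threshold for the one noisy logger only.
# The id "filterexpr" is arbitrary; any unique id should work.
logger.filterexpr.name  = org.apache.jena.sparql.engine.iterator.QueryIterFilterExpr
logger.filterexpr.level = ERROR
```

Note that this hides every warning from that class, not only the REGEX one.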


Issuing millions of instances of the same identical warning certainly 
seems like a bug to me, especially since it is elicited by query 
input and so could potentially be abused as a DoS attack vector.


Rob


From: Mikael Pesonen 
Date: Wednesday, 29 March 2023 at 10:22
To: users@jena.apache.org 
Subject: Re: Strategies to avoid log flooding
Below is the log, so is it possible to filter just these out?

Unfortunately I don't recall the exact regex, but it was related to
escaping parentheses, so maybe this, or the same with one backslash:
...



VALUES ?class_label { " \\(häiriö\\)" " \\(löydös\\)" " \\(toimenpide\\)" }

?concept rdfs:label ?fsnl FILTER (REGEX(?fsnl, $class_label)) .


That does not align with the log message, which says the pattern is 
" \\(häiriö\\)"@fi, meaning $class_label is tagged @fi.

Use str() to get the lexical part.

The regex is potentially different every call. So the regex is 
compiled every call. (If it's the same, a constant, it is compiled once.)


Here, write as three calls, one per constant.

Or use CONTAINS, because a regex is unnecessary in this case.

    Andy


...

So this is a bug not a feature and can be corrected?

Mar 27 13:13:33 insight-terms java[2512289]: [2023-03-27 13:13:33]
QueryIterFilterExpr WARN  Expression Exception in (regex ?fsnl 
?class_label)

Mar 27 13:13:33 insight-terms java[2512289]:
org.apache.jena.sparql.expr.ExprException: REGEX: Pattern is not a
string: " \\(häiriö\\)"@fi
Mar 27 13:13:33 insight-terms java[2512289]: #011at
org.apache.jena.sparql.expr.E_Regex.makeRegexEngine(E_Regex.java:120)
~[fuseki-server.jar:4.6.1]


--
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's 
Tools - Text Tools - E-books and M-books

Mikael Pesonen
Semantic Technologies

e-mail: mikael.peso...@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND



Re: Strategies to avoid log flooding

2023-03-29 Thread Mikael Pesonen


On 28/03/2023 16.04, Rob @ DNR wrote:

A GitHub issue with a minimal example query that reproduces the issue would be 
a good start so we can reproduce the issue and look into a fix

In workaround terms end users control their logging configuration so you could 
create a Log4j configuration that disables logging for the specific offending 
logger (assuming that this is a sufficiently specific logger to not suppress 
actually relevant logging)

Rob

From: Mikael Pesonen 
Date: Tuesday, 28 March 2023 at 11:21
To: users@jena.apache.org 
Subject: Strategies to avoid log flooding
Hi,

there are some cases where Jena generates dozens of gigabytes, maybe even
terabytes, of log output for a single query. A bad REGEX produces a long
warning-level exception for every row in the database, or at least millions
of them (the disk filled up, so I don't know the exact number). Is there a
way to avoid this other than disabling warnings?






Strategies to avoid log flooding

2023-03-28 Thread Mikael Pesonen

Hi,

there are some cases where Jena generates dozens of gigabytes, maybe even 
terabytes, of log output for a single query. A bad REGEX produces a long 
warning-level exception for every row in the database, or at least millions 
of them (the disk filled up, so I don't know the exact number). Is there a 
way to avoid this other than disabling warnings?


Re: Minus in shortened URL

2023-02-16 Thread Mikael Pesonen
Thanks!

On Wed, 15 Feb 2023 at 21:08, Andy Seaborne  wrote:

>
>
> On 15/02/2023 14:05, Mikael Pesonen wrote:
> >
> > This works:
> > <https://www.example.com/-30>
> >
> > but this results in the error Expected IRI for predicate: got: [INTEGER:-30]:
> >
> > @prefix id: <https://www.example.com/> .
> > id:-30 a skos:Concept
> >
> > Latter should be legal?
>
> No.
>
> https://www.w3.org/TR/turtle/#grammar-production-PN_LOCAL
>
> Minus can not be the first character.
>
> use an escape:
>
> id:\-30 a skos:Concept
>
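
The same restriction on local names applies to prefixed names in SPARQL, so a query against such a resource would need the escape as well. A minimal sketch, assuming the id: prefix from the example above and the standard SKOS namespace for skos:

```sparql
PREFIX id:   <https://www.example.com/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# id:\-30 expands to <https://www.example.com/-30>
ASK { id:\-30 a skos:Concept }
```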


Minus in shortened URL

2023-02-15 Thread Mikael Pesonen



This works:

<https://www.example.com/-30>

but this results in the error Expected IRI for predicate: got: [INTEGER:-30]:

@prefix id: <https://www.example.com/> .
id:-30 a skos:Concept

Shouldn't the latter be legal?


Re: How to handle optional lists in SPARQL

2022-12-21 Thread Mikael Pesonen

Hi, thanks for the suggestion! I didn't think of property paths.

So in the case of just an optional list, /list:member? works fine. And for my 
more complicated case, it works as you wrote.
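
A sketch of how the optional-list path might be applied to the UNION query quoted below, so that both branches collapse into one pattern. The assumptions are loud ones: the id: namespace is taken to be <http://snomed.info/id/>, the SELECT shape and the helper variables ?group and ?inner are illustrative, and list:member is assumed to behave inside a path as described in this thread (the standard-SPARQL spelling rdf:rest*/rdf:first could stand in for it).

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX list: <http://jena.apache.org/ARQ/list#>
PREFIX id:   <http://snomed.info/id/>   # assumed namespace for the id: prefix

SELECT ?finding ?site
WHERE {
  ?finding (owl:equivalentClass|rdfs:subClassOf)/owl:intersectionOf/list:member ?group .
  ?group a owl:Restriction ;
         owl:onProperty id:609096000 ;
         # Zero or one hop through a list: a direct value or a list member.
         owl:someValuesFrom/(owl:intersectionOf/list:member)? ?inner .
  ?inner a owl:Restriction ;
         owl:onProperty id:363698007 ;
         owl:someValuesFrom ?site .
}
```

Each further place where a value may be either a single item or a list then costs one more `( ... )?` step rather than doubling the number of UNION branches.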




On 12/12/2022 17.08, Lorenz Buehmann wrote:
I don't know your full query restrictions, but without given 
properties it would be a "simple" property path, no? Something like


owl:someValuesFrom/((owl:intersectionOf|owl:unionOf)/list:member)?

where the list closure is optional and if you want to make it nested

(owl:someValuesFrom/((owl:intersectionOf|owl:unionOf)/list:member)?)*

So a query like

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix list: <http://jena.apache.org/ARQ/list#>

select * {
  ?subclass rdfs:subClassOf|owl:equivalentClass
    [ (owl:intersectionOf|owl:unionOf)/list:member/(owl:someValuesFrom/((owl:intersectionOf|owl:unionOf)/list:member)?)* ?m ]

  FILTER(isIRI(?m))
}

could work. You could even try to make it more generic.


But maybe you have different requirements, in that case it would be 
easier to help with sample data. My sample data now was


@prefix : <http://www.semanticweb.org/user/ontologies/2022/11/untitled-ontology-100#> .

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:Foo rdf:type owl:Class ;
    owl:equivalentClass [ rdf:type owl:Class ;
        owl:unionOf ( :A
            [ rdf:type owl:Restriction ;
              owl:onProperty :p ;
              owl:someValuesFrom [ rdf:type owl:Class ;
                  owl:unionOf ( :B
                      [ rdf:type owl:Restriction ;
                        owl:onProperty :q ;
                        owl:someValuesFrom :C
                      ]
                  )
              ]
            ]
        )
    ] .


Cheers,

Lorenz

On 12.12.22 13:49, Mikael Pesonen wrote:


Is there a shortcut for making queries where a data value can be a 
single item or a list of items?


For example, this is how I do a query now using UNION. Both parts are 
identical except for the single/list section in owl:someValuesFrom 
[]. This is still somewhat readable, but with multiple 
occurrences the query length and complexity grow exponentially.


{
    ?finding owl:equivalentClass|rdfs:subClassOf [
        owl:intersectionOf [
            list:member [
                rdf:type owl:Restriction ;
                owl:onProperty id:609096000 ;
                owl:someValuesFrom [
                    rdf:type owl:Restriction ;
                    owl:onProperty id:363698007 ;
                    owl:someValuesFrom ?site
                ]
            ]
        ]
    ]
    }
    UNION
    {
    ?finding owl:equivalentClass|rdfs:subClassOf [
        owl:intersectionOf [
            list:member [
                rdf:type owl:Restriction ;
                owl:onProperty id:609096000 ;
                owl:someValuesFrom [
                    owl:intersectionOf [
                        list:member [
                            rdf:type owl:Restriction ;
                            owl:onProperty id:363698007 ;
                            owl:someValuesFrom ?site
                        ]
                    ]
                ]
            ]
        ]
    ]
    }

The data is not ours so we can't make everything lists.





How to handle optional lists in SPARQL

2022-12-12 Thread Mikael Pesonen



Is there a shortcut for making queries where a data value can be a single 
item or a list of items?


For example, this is how I do a query now using UNION. Both parts are 
identical except for the single/list section in owl:someValuesFrom []. 
This is still somewhat readable, but with multiple occurrences the 
query length and complexity grow exponentially.


{
    ?finding owl:equivalentClass|rdfs:subClassOf [
        owl:intersectionOf [
            list:member [
                rdf:type owl:Restriction ;
                owl:onProperty id:609096000 ;
                owl:someValuesFrom [
                    rdf:type owl:Restriction ;
                    owl:onProperty id:363698007 ;
                    owl:someValuesFrom ?site
                ]
            ]
        ]
    ]
    }
    UNION
    {
    ?finding owl:equivalentClass|rdfs:subClassOf [
        owl:intersectionOf [
            list:member [
                rdf:type owl:Restriction ;
                owl:onProperty id:609096000 ;
                owl:someValuesFrom [
                    owl:intersectionOf [
                        list:member [
                            rdf:type owl:Restriction ;
                            owl:onProperty id:363698007 ;
                            owl:someValuesFrom ?site
                        ]
                    ]
                ]
            ]
        ]
    ]
    }

The data is not ours so we can't make everything lists.


Re: Weird sparql problem

2022-11-15 Thread Mikael Pesonen
e the triple patterns in a sub-optimal order 
for evaluation even if BGPs are not split.

The short-term fix (again see Andy’s email [1]) is to disable this optimisation 
by default, users can opt back into it if they find it benefits their usage of 
Jena on their datasets.

The long-term fix is probably to rearchitect the logical optimiser in some way 
to allow more data context to be visible to it i.e., making the logical BGP 
reordering statistics aware, making ARQ’s overall optimisation strategy more 
hybrid.  If anyone is interested, I’d imagine there’ll be a thread on this on 
the dev list soon

Hope this helps,

Rob

[1]: https://lists.apache.org/thread/37cloogcb3wzmkl0s33ttnxyg0kvq69p
[2]: http://daslab.seas.harvard.edu/reading-group/papers/volcano.pdf


From: Mikael Pesonen 
Date: Tuesday, 8 November 2022 at 11:04
To: users@jena.apache.org 
Subject: Re: Weird sparql problem
Both your suggestions for rewriting the query worked. I'm lost as to the
reasons, but for future cases, is breaking up problematic queries with {}
the way to go?

On 04/11/2022 11.25, rve...@dotnetrdf.org wrote:

So yes as suspected the triple patterns are being reordered badly in the BGP:

(sequence
  (table (vars ?sct_code)
    (row [?sct_code "298314008"])
  )
  (bgp
    (triple ?c skos:inScheme lsu:SNOMEDCT_US)
    (triple ?c skosxl:prefLabel ??0)
    (triple ??0 lsu:code ?sct_code)
  )))

The optimizer doesn’t take into account the fact that the ?sct_code variable is 
going to be bound by the VALUES clause (table in the algebra) so considers that 
the least specific triple pattern (as it has two variables) causing it to 
evaluate a much less specific triple pattern first.

Lorenz’s suggestion of generating statistics for your dataset is a good one, 
statistics would likely guide the optimiser that the ?c skos:inScheme 
lsu:SNOMEDCT_US triple is actually very non-specific for your dataset.

You could also try Andy’s suggestion else-thread i.e. --set 
arq:optReorderBGP=false passed to the CLI command in question, or if this is 
being called from code ARQ.getContext().set(ARQ.optReorderBGP, false);

The other thing you can do is explicitly break up your query further i.e.

{ VALUES ?sct_code { "298314008" }
  { _:b0  lsu:code          ?sct_code .
    ?c    skosxl:prefLabel  _:b0 . }
  { ?c    skos:inScheme     lsu:SNOMEDCT_US }
}

Essentially forcing the engine to evaluate that very unspecific triple pattern 
last

Another possibility would be to change that triple pattern to be in a FILTER 
EXISTS condition, so it’d only be evaluated for matches to your other triple 
patterns i.e.

{ VALUES ?sct_code { "298314008" }
  _:b0  lsu:code          ?sct_code .
  ?c    skosxl:prefLabel  _:b0 .
  FILTER EXISTS { ?c skos:inScheme lsu:SNOMEDCT_US }
}

Hope this helps,

Rob

From: Lorenz Buehmann 
Date: Thursday, 3 November 2022 at 11:12
To: users@jena.apache.org 
Subject: Re: Re: Weird sparql problem
tdbquery --explain --loc  $TDB_LOC  "query here"

would also work to see the plan - maybe also increase log level to see
more: https://jena.apache.org/documentation/tdb/optimizer.html

Another question: did you generate the TDB stats so that they could be
used by the optimizer?

for debugging purpose, you could also disable query optimization (put an
empty none.opt file into $TDB_LOC/Data-0001 dir)  and reorder your query
manually, i.e.


WHERE
  { VALUES ?sct_code { "298314008" }
    _:b0  lsu:code  ?sct_code .
    ?c    skosxl:prefLabel  _:b0 .
    ?c    skos:inScheme lsu:SNOMEDCT_US
  }

without stats and based on heuristics (e.g. number of variables in
triple pattern), otherwise the last triple pattern might always be
evaluated first


On 03.11.22 11:11, Mikael Pesonen wrote:

Here's the parse, hope it helps:

WHERE
  { VALUES ?sct_code { "298314008" }
    ?c    skosxl:prefLabel  _:b0 .
    _:b0  lsu:code  ?sct_code .
    ?c    skos:inScheme lsu:SNOMEDCT_US
  }
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(prefix ((owl: <http://www.w3.org/2002/07/owl#>)
   (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
   (skosxl: <http://www.w3.org/2008/05/skos-xl#>)
   (skos: <http://www.w3.org/2004/02/skos/core#>)
   (dcterms: <http://purl.org/dc/terms/>)
   (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
   (lsr: <https://resource.lingsoft.fi/>)
   (id: <http://snomed.info/id/>)
   (dcat: <http://www.w3.org/ns/dcat#>)
   (dc: <http://purl.org/dc/elements/1.1/>)
   (lsu: <htt

Re: Weird sparql problem

2022-11-08 Thread Mikael Pesonen
Both your suggestions for rewriting the query worked. I'm lost as to the 
reasons, but for future cases, is breaking up problematic queries with {} 
the way to go?


On 04/11/2022 11.25, rve...@dotnetrdf.org wrote:

So yes as suspected the triple patterns are being reordered badly in the BGP:

   (sequence
 (table (vars ?sct_code)
   (row [?sct_code "298314008"])
 )
 (bgp
   (triple ?c skos:inScheme lsu:SNOMEDCT_US)
   (triple ?c skosxl:prefLabel ??0)
   (triple ??0 lsu:code ?sct_code)
 )))

The optimizer doesn’t take into account the fact that the ?sct_code variable is 
going to be bound by the VALUES clause (table in the algebra) so considers that 
the least specific triple pattern (as it has two variables) causing it to 
evaluate a much less specific triple pattern first.

Lorenz’s suggestion of generating statistics for your dataset is a good one, 
statistics would likely guide the optimiser that the ?c skos:inScheme 
lsu:SNOMEDCT_US triple is actually very non-specific for your dataset.

You could also try Andy’s suggestion else-thread i.e. --set 
arq:optReorderBGP=false passed to the CLI command in question, or if this is 
being called from code ARQ.getContext().set(ARQ.optReorderBGP, false);

The other thing you can do is explicitly break up your query further i.e.

{ VALUES ?sct_code { "298314008" }
  { _:b0  lsu:code          ?sct_code .
    ?c    skosxl:prefLabel  _:b0 . }
  { ?c    skos:inScheme     lsu:SNOMEDCT_US }
}

Essentially forcing the engine to evaluate that very unspecific triple pattern 
last

Another possibility would be to change that triple pattern to be in a FILTER 
EXISTS condition, so it’d only be evaluated for matches to your other triple 
patterns i.e.

{ VALUES ?sct_code { "298314008" }
  _:b0  lsu:code          ?sct_code .
  ?c    skosxl:prefLabel  _:b0 .
  FILTER EXISTS { ?c skos:inScheme lsu:SNOMEDCT_US }
}

Hope this helps,

Rob

From: Lorenz Buehmann 
Date: Thursday, 3 November 2022 at 11:12
To: users@jena.apache.org 
Subject: Re: Re: Weird sparql problem
tdbquery --explain --loc  $TDB_LOC  "query here"

would also work to see the plan - maybe also increase log level to see
more: https://jena.apache.org/documentation/tdb/optimizer.html

Another question: did you generate the TDB stats so that they could be
used by the optimizer?

for debugging purpose, you could also disable query optimization (put an
empty none.opt file into $TDB_LOC/Data-0001 dir)  and reorder your query
manually, i.e.


WHERE
  { VALUES ?sct_code { "298314008" }
    _:b0  lsu:code  ?sct_code .
    ?c    skosxl:prefLabel  _:b0 .
    ?c    skos:inScheme lsu:SNOMEDCT_US
  }

without stats and based on heuristics (e.g. number of variables in
triple pattern), otherwise the last triple pattern might always be
evaluated first


On 03.11.22 11:11, Mikael Pesonen wrote:

Here's the parse, hope it helps:

WHERE
  { VALUES ?sct_code { "298314008" }
    ?c    skosxl:prefLabel  _:b0 .
    _:b0  lsu:code  ?sct_code .
    ?c    skos:inScheme lsu:SNOMEDCT_US
  }
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(prefix ((owl: <http://www.w3.org/2002/07/owl#>)
  (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
  (skosxl: <http://www.w3.org/2008/05/skos-xl#>)
  (skos: <http://www.w3.org/2004/02/skos/core#>)
  (dcterms: <http://purl.org/dc/terms/>)
  (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
  (lsr: <https://resource.lingsoft.fi/>)
  (id: <http://snomed.info/id/>)
  (dcat: <http://www.w3.org/ns/dcat#>)
  (dc: <http://purl.org/dc/elements/1.1/>)
  (lsu: <https://www.lingsoft.fi/ns/umls/>))
   (sequence
 (table (vars ?sct_code)
   (row [?sct_code "298314008"])
 )
 (bgp
   (triple ?c skos:inScheme lsu:SNOMEDCT_US)
   (triple ?c skosxl:prefLabel ??0)
   (triple ??0 lsu:code ?sct_code)
 )))


On 02/11/2022 12.32, rve...@dotnetrdf.org wrote:

For these kind of performance issues it is useful to see the SPARQL
algebra for the whole query, not just fragments of the query.  You
can use the qparse command for the version of Jena you are using to
see how it is optimising your queries e.g.

qparse --explain --query example.rq

As Lorenz suggests this may be the optimiser making a bad guess at
the appropriate order in which to evaluate the triple patterns within
the BGP but without the larger query context or the algebra all we
can do is guess.

Rob

From: Mikael Pesonen 
Date: Tuesday, 1 November 2022 at 12:53
To: users@jena.a

Re: Weird sparql problem

2022-11-08 Thread Mikael Pesonen

I ran your version of the query with none.opt and saw no change. For

tdbstats --loc=DIR|--desc=assemblerFile [--graph=URI]

could you please explain the loc and desc parameters?




On 03/11/2022 13.11, Lorenz Buehmann wrote:

tdbquery --explain --loc  $TDB_LOC  "query here"

would also work to see the plan - maybe also increase log level to see 
more: https://jena.apache.org/documentation/tdb/optimizer.html


Another question: did you generate the TDB stats so that they could be 
used by the optimizer?


for debugging purpose, you could also disable query optimization (put 
an empty none.opt file into $TDB_LOC/Data-0001 dir)  and reorder your 
query manually, i.e.



WHERE
  { VALUES ?sct_code { "298314008" }
  _:b0  lsu:code  ?sct_code .
    ?c    skosxl:prefLabel  _:b0 .
    ?c    skos:inScheme lsu:SNOMEDCT_US
  } 


without stats and based on heuristics (e.g. number of variables in 
triple pattern), otherwise the last triple pattern might always be 
evaluated first



On 03.11.22 11:11, Mikael Pesonen wrote:

Here's the parse, hope it helps:

WHERE
  { VALUES ?sct_code { "298314008" }
    ?c    skosxl:prefLabel  _:b0 .
    _:b0  lsu:code  ?sct_code .
    ?c    skos:inScheme lsu:SNOMEDCT_US
  }
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(prefix ((owl: <http://www.w3.org/2002/07/owl#>)
 (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
 (skosxl: <http://www.w3.org/2008/05/skos-xl#>)
 (skos: <http://www.w3.org/2004/02/skos/core#>)
 (dcterms: <http://purl.org/dc/terms/>)
 (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
 (lsr: <https://resource.lingsoft.fi/>)
 (id: <http://snomed.info/id/>)
 (dcat: <http://www.w3.org/ns/dcat#>)
 (dc: <http://purl.org/dc/elements/1.1/>)
 (lsu: <https://www.lingsoft.fi/ns/umls/>))
  (sequence
    (table (vars ?sct_code)
  (row [?sct_code "298314008"])
    )
    (bgp
  (triple ?c skos:inScheme lsu:SNOMEDCT_US)
  (triple ?c skosxl:prefLabel ??0)
  (triple ??0 lsu:code ?sct_code)
    )))


On 02/11/2022 12.32, rve...@dotnetrdf.org wrote:
For these kind of performance issues it is useful to see the SPARQL 
algebra for the whole query, not just fragments of the query.  You 
can use the qparse command for the version of Jena you are using to 
see how it is optimising your queries e.g.


qparse --explain --query example.rq

As Lorenz suggests this may be the optimiser making a bad guess at 
the appropriate order in which to evaluate the triple patterns 
within the BGP but without the larger query context or the algebra 
all we can do is guess.


Rob

From: Mikael Pesonen 
Date: Tuesday, 1 November 2022 at 12:53
To: users@jena.apache.org 
Subject: Re: Weird sparql problem
Different case, but again the hanging makes no sense to a user, whatever
the technical reasons are.

   VALUES ?sct_code { "298314008" }
 ?c skosxl:prefLabel [ lsu:code ?sct_code ]

returns one row immediately, but

   VALUES ?sct_code { "298314008" }
 ?c skosxl:prefLabel [ lsu:code ?sct_code ]; skos:inScheme
lsu:SNOMEDCT_US

hangs forever


   skos:inScheme lsu:SNOMEDCT_US;

On 18/10/2022 9.08, Lorenz Buehmann wrote:

Hi,

comments inline

On 17.10.22 14:35, Mikael Pesonen wrote:

This works as a separate query, but not in the middle, since ?s
gets new values instead of binding to the previous ?s.

{ select ?t where {
?s a ?t .
  } limit 10}
   ?t skos:prefLabel ?l


In the middle of what? Subqueries will be evaluated first - if you
really want labels for classes, you should use a DISTINCT in the
subquery such that the intermediate result is small, there shouldn't
be that many classes, but many instances with the same class, thus,
the join would be more expensive than necessary.
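
The DISTINCT suggestion can be sketched as follows. The SELECT shape is an assumption, the SKOS namespace is the one shown in the prefix listings elsewhere in this thread, and the LIMIT from the fragment above is left out.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?t ?l
WHERE {
  # Evaluated first: one row per class rather than one per instance.
  { SELECT DISTINCT ?t WHERE { ?s a ?t } }
  # The join then runs over the set of distinct classes only.
  ?t skos:prefLabel ?l
}
```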



On 17/10/2022 14.56, Mikael Pesonen wrote:

?s a ?t .
   ?t skos:prefLabel ?l

returns 3 million triples. Maybe it's related to this?

I don't see how this should be related to  your initial query where ?s
was bound, which in my opinion should be an easy join. Is it possible
for you to share the dataset somehow? Also, what you can do is to
compute statistics for the TDB database with tdbstats tool [1] from
command line and put it into the TDB folder. But even without statistics, the
query plan should take the first triple pattern, use the spo index as s and
p are bound, then pass the bindings of ?o to the evaluation of the
second triple pattern

[1]
https://jena.apache.org/documentation/tdb/optimizer.html#generating-a-statistics-file 






On 21/09/2022 9.15, Lorenz Buehmann wrote:

Weird, only 10M triples and each triple pattern returns only 1
binding, thus, the size is tiny - honestly I can't think of
anything except for open connections, but as you mentioned, running
the queries with only one triple pattern works as expected, so that
too many open connections shouldn't be an issu

Re: Weird sparql problem

2022-11-03 Thread Mikael Pesonen

Here's the parse, hope it helps:

WHERE
  { VALUES ?sct_code { "298314008" }
    ?c    skosxl:prefLabel  _:b0 .
    _:b0  lsu:code  ?sct_code .
    ?c    skos:inScheme lsu:SNOMEDCT_US
  }
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(prefix ((owl: <http://www.w3.org/2002/07/owl#>)
 (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
 (skosxl: <http://www.w3.org/2008/05/skos-xl#>)
 (skos: <http://www.w3.org/2004/02/skos/core#>)
 (dcterms: <http://purl.org/dc/terms/>)
 (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
 (lsr: <https://resource.lingsoft.fi/>)
 (id: <http://snomed.info/id/>)
 (dcat: <http://www.w3.org/ns/dcat#>)
 (dc: <http://purl.org/dc/elements/1.1/>)
 (lsu: <https://www.lingsoft.fi/ns/umls/>))
  (sequence
    (table (vars ?sct_code)
  (row [?sct_code "298314008"])
    )
    (bgp
  (triple ?c skos:inScheme lsu:SNOMEDCT_US)
  (triple ?c skosxl:prefLabel ??0)
  (triple ??0 lsu:code ?sct_code)
    )))


On 02/11/2022 12.32, rve...@dotnetrdf.org wrote:

For these kind of performance issues it is useful to see the SPARQL algebra for 
the whole query, not just fragments of the query.  You can use the qparse 
command for the version of Jena you are using to see how it is optimising your 
queries e.g.

qparse --explain --query example.rq

As Lorenz suggests this may be the optimiser making a bad guess at the 
appropriate order in which to evaluate the triple patterns within the BGP but 
without the larger query context or the algebra all we can do is guess.

Rob

From: Mikael Pesonen 
Date: Tuesday, 1 November 2022 at 12:53
To: users@jena.apache.org 
Subject: Re: Weird sparql problem
Different case, but again the hanging makes no sense to a user, whatever
the technical reasons are.

   VALUES ?sct_code { "298314008" }
 ?c skosxl:prefLabel [ lsu:code ?sct_code ]

returns one row immediately, but

   VALUES ?sct_code { "298314008" }
 ?c skosxl:prefLabel [ lsu:code ?sct_code ]; skos:inScheme
lsu:SNOMEDCT_US

hangs forever


   skos:inScheme lsu:SNOMEDCT_US;

On 18/10/2022 9.08, Lorenz Buehmann wrote:

Hi,

comments inline

On 17.10.22 14:35, Mikael Pesonen wrote:

This works as a separate query, but not in the middle, since ?s
gets new values instead of binding to the previous ?s.

{ select ?t where {
?s a ?t .
  } limit 10}
   ?t skos:prefLabel ?l


In the middle of what? Subqueries will be evaluated first -  if you
really want labels for classes, you should use a DISTINCT in the
subquery such that the intermediate result is small, there shouldn't
be that many classes, but many instances with the same class, thus,
the join would be more expensive than necessary.



On 17/10/2022 14.56, Mikael Pesonen wrote:

?s a ?t .
   ?t skos:prefLabel ?l

returns 3 million triples. Maybe it's related to this?

I don't see how this should be related to  your initial query where ?s
was bound, which in my opinion should be an easy join. Is it possible
for you to share the dataset somehow? Also, what you can do is to
compute statistics for the TDB database with tdbstats tool [1] from
command line and put it into the TDB folder. But even without statistics, the
query plan should take the first triple pattern, use the spo index as s and
p are bound, then pass the bindings of ?o to the evaluation of the
second triple pattern

[1]
https://jena.apache.org/documentation/tdb/optimizer.html#generating-a-statistics-file




On 21/09/2022 9.15, Lorenz Buehmann wrote:

Weird, only 10M triples and each triple pattern returns only 1
binding, thus, the size is tiny - honestly I can't think of
anything except for open connections, but as you mentioned, running
the queries with only one triple pattern works as expected, so that
too many open connections shouldn't be an issue most likely.

Can you reproduce this behavior with newer Jena versions like 4.6.1?

Or can you reproduce this on different servers as well?

Is it also stuck if you run the query directly after you restart
Fuseki?


On 19.09.22 13:49, Mikael Pesonen wrote:


On 15/09/2022 17.48, Lorenz Buehmann wrote:

Forgot:

- size of result for each triple pattern? Might affect if hash
join can be used.

It's one row for each.

- your hardware?

Normal server with 16gigs mem.

- is it just the first query after starting Fuseki? Connections
have been closed? Note, there was also a bug in a recent Jena
version, but only with TDB and too many open connections. It has
been resolved with release 4.6.1.

Jena has been running quite a while.

Might not be related, but I'm mentioning all things here
nevertheless.


On 15.09.22 11:16, Mikael Pesonen wrote:

This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
   <https://x.y.z> a ?t .
   #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
   #

Re: Weird sparql problem

2022-11-03 Thread Mikael Pesonen



Where do I get this qparse?

On 02/11/2022 12.32, rve...@dotnetrdf.org wrote:

For these kind of performance issues it is useful to see the SPARQL algebra for 
the whole query, not just fragments of the query.  You can use the qparse 
command for the version of Jena you are using to see how it is optimising your 
queries e.g.

qparse --explain --query example.rq

As Lorenz suggests this may be the optimiser making a bad guess at the 
appropriate order in which to evaluate the triple patterns within the BGP but 
without the larger query context or the algebra all we can do is guess.

Rob

From: Mikael Pesonen 
Date: Tuesday, 1 November 2022 at 12:53
To: users@jena.apache.org 
Subject: Re: Weird sparql problem
Different case, but again the hanging makes no sense to a user, whatever
the technical reasons are.

   VALUES ?sct_code { "298314008" }
 ?c skosxl:prefLabel [ lsu:code ?sct_code ]

returns one row immediately, but

   VALUES ?sct_code { "298314008" }
 ?c skosxl:prefLabel [ lsu:code ?sct_code ]; skos:inScheme
lsu:SNOMEDCT_US

hangs forever


   skos:inScheme lsu:SNOMEDCT_US;

On 18/10/2022 9.08, Lorenz Buehmann wrote:

Hi,

comments inline

On 17.10.22 14:35, Mikael Pesonen wrote:

This works as a separate query, but not in the middle, since ?s
gets new values instead of binding to the previous ?s.

{ select ?t where {
?s a ?t .
  } limit 10}
   ?t skos:prefLabel ?l


In the middle of what? Subqueries will be evaluated first -  if you
really want labels for classes, you should use a DISTINCT in the
subquery such that the intermediate result is small, there shouldn't
be that many classes, but many instances with the same class, thus,
the join would be more expensive than necessary.



On 17/10/2022 14.56, Mikael Pesonen wrote:

?s a ?t .
   ?t skos:prefLabel ?l

returns 3 million triples. Maybe it's related to this?

I don't see how this should be related to  your initial query where ?s
was bound, which in my opinion should be an easy join. Is it possible
for you to share the dataset somehow? Also, what you can do is to
compute statistics for the TDB database with tdbstats tool [1] from
command line and put it into the TDB folder. But even without statistics, the
query plan should take the first triple pattern, use the spo index as s and
p are bound, then pass the bindings of ?o to the evaluation of the
second triple pattern

[1]
https://jena.apache.org/documentation/tdb/optimizer.html#generating-a-statistics-file




On 21/09/2022 9.15, Lorenz Buehmann wrote:

Weird, only 10M triples and each triple pattern returns only 1
binding, thus, the size is tiny - honestly I can't think of
anything except for open connections, but as you mentioned, running
the queries with only one triple pattern works as expected, so that
too many open connections shouldn't be an issue most likely.

Can you reproduce this behavior with newer Jena versions like 4.6.1?

Or can you reproduce this on different servers as well?

Is it also stuck if you run the query directly after you restart
Fuseki?


On 19.09.22 13:49, Mikael Pesonen wrote:


On 15/09/2022 17.48, Lorenz Buehmann wrote:

Forgot:

- size of result for each triple pattern? Might affect if hash
join can be used.

It's one row for each.

- your hardware?

Normal server with 16gigs mem.

- is it just the first query after starting Fuseki? Connections
have been closed? Note, there was also a bug in a recent Jena
version, but only with TDB and too many open connections. It has
been resolved with release 4.6.1.

Jena has been running quite a while.

Might not be related, but I'm mentioning all things here
nevertheless.


On 15.09.22 11:16, Mikael Pesonen wrote:

This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
   <https://x.y.z> a ?t .
   #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
   #<https://x.y.z> a ?t .
   :C1 skos:prefLabel ?l
}


But this always hangs until timeout

SELECT *
FROM <https://a.b.c>
WHERE {
   <https://x.y.z> a ?t .
   ?t skos:prefLabel ?l
}

What am I missing here? I'm using Fuseki web GUI. Thanks!







Re: Compact does nothing

2022-11-03 Thread Mikael Pesonen

Ah, forgot that, thanks!

On 02/11/2022 18.58, Andy Seaborne wrote:



On 02/11/2022 16:25, Mikael Pesonen wrote:


Should compact work without additional configuration? This

curl -X POST http://localhost:3030/$/compact/ds

ran for maybe 30 minutes but didn't free any disk space. I have a Jena
data size of 110 GB and deleted about half of the data with SPARQL before
compacting.


Compact copies the in-use data from e.g. /ds/Data-0001 to /ds/Data-0002.

You can then delete Data-0001, or archive it and then delete it, or whatever
you wish to do.
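
As a rough illustration of the manual cleanup step, the sketch below lists the storage generations that would be candidates for deletion. It assumes the Data-NNNN naming shown above and that the highest-numbered directory is the one in use after compaction; the function name and the throwaway directory layout are made up for the example, and nothing is deleted.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming of storage generations: Data-0001, Data-0002, ...
GENERATION = re.compile(r"^Data-(\d+)$")


def stale_generations(db_dir: Path) -> list[Path]:
    """Return every Data-NNNN directory except the highest-numbered one."""
    found = []
    for child in db_dir.iterdir():
        match = GENERATION.match(child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), child))
    found.sort()
    # Assumption: the newest generation is the live one, so keep it.
    return [path for _, path in found[:-1]]


def demo() -> list[str]:
    """Build a throwaway layout and report which directories are stale."""
    with tempfile.TemporaryDirectory() as tmp:
        db = Path(tmp) / "ds"
        for name in ("Data-0001", "Data-0002", "Data-0003", "other"):
            (db / name).mkdir(parents=True)
        return [p.name for p in stale_generations(db)]


if __name__ == "__main__":
    print(demo())
```

Before removing anything on a real server, it seems prudent to confirm that compaction has finished and to archive the old directory first.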


The next Fuseki version has ?deleteOld=true to delete the files.

    Andy
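
For scripting the call, a minimal sketch of building the same POST request, optionally with the `deleteOld=true` parameter mentioned above, could look like this. The helper name is hypothetical, the request is only built and not sent, and whether `deleteOld` is honoured depends on the Fuseki version in use.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_compact_request(server: str, dataset: str, delete_old: bool = False) -> Request:
    """Build (but do not send) the POST request that triggers compaction."""
    url = f"{server.rstrip('/')}/$/compact/{quote(dataset.strip('/'))}"
    if delete_old:
        # Only understood by Fuseki versions that support ?deleteOld=true.
        url += "?" + urlencode({"deleteOld": "true"})
    return Request(url, method="POST")


if __name__ == "__main__":
    req = build_compact_request("http://localhost:3030", "ds", delete_old=True)
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send it; that step is left out here because it needs a running server.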





Compact does nothing

2022-11-02 Thread Mikael Pesonen



Should compact work without additional configuration? This

curl -X POST http://localhost:3030/$/compact/ds

ran for maybe 30 minutes but didn't free any disk space. I have a Jena
data size of 110 GB and deleted about half of the data with SPARQL before
compacting.


Re: Weird sparql problem

2022-11-02 Thread Mikael Pesonen



BIND( "298314008" AS ?sct_code )
   ?c skosxl:prefLabel [ lsu:code ?sct_code ]; skos:inScheme lsu:SNOMEDCT_US


takes a long time also, about 8 minutes (all previous are also slow but 
finish)


On 02/11/2022 9.35, Lorenz Buehmann wrote:

I think if you use

BIND( "298314008" AS ?sct_code )

it would work for the second query?

Looks like the query optimizer does the join in the wrong order

@Andy?

On 01.11.22 13:52, Mikael Pesonen wrote:
Different case, but again hanging makes no sense to the user, whatever
the technical reasons are.


 VALUES ?sct_code { "298314008" }
   ?c skosxl:prefLabel [ lsu:code ?sct_code ]

returns one row immediately, but

 VALUES ?sct_code { "298314008" }
   ?c skosxl:prefLabel [ lsu:code ?sct_code ]; skos:inScheme lsu:SNOMEDCT_US


hangs forever


 skos:inScheme lsu:SNOMEDCT_US;

On 18/10/2022 9.08, Lorenz Buehmann wrote:

Hi,

comments inline

On 17.10.22 14:35, Mikael Pesonen wrote:
This works as a separate query, but not in the middle, since ?s
gets new values instead of binding to the previous ?s.


{ select ?t where {
?s a ?t .
 } limit 10}
  ?t skos:prefLabel ?l



In the middle of what? Subqueries are evaluated first. If you
really want labels for classes, you should use DISTINCT in the
subquery so that the intermediate result is small. There shouldn't
be that many classes, but there are many instances with the same class,
so without DISTINCT the join is more expensive than necessary.
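
The effect of DISTINCT on the intermediate result can be sketched in a few lines. This is a toy model and not how Jena evaluates the subquery; the function name, the class names and the label map are assumptions made up for the example.

```python
def class_labels(instance_types, labels, distinct):
    """Join the subquery's ?t bindings with a ?t -> label map."""
    bindings = list(instance_types)
    if distinct:
        # Order-preserving de-duplication, like SELECT DISTINCT ?t.
        bindings = list(dict.fromkeys(bindings))
    # One output row per binding that has a label.
    return [(t, labels[t]) for t in bindings if t in labels]


if __name__ == "__main__":
    # Six hypothetical instances that share only two classes.
    types = [":Person"] * 4 + [":Place"] * 2
    labels = {":Person": "Person", ":Place": "Place"}
    print(len(class_labels(types, labels, distinct=False)))
    print(class_labels(types, labels, distinct=True))
```

Without DISTINCT the join is performed once per instance, with DISTINCT once per class, which is the saving the advice above is aiming at.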





On 17/10/2022 14.56, Mikael Pesonen wrote:


?s a ?t .
  ?t skos:prefLabel ?l

returns 3 million triples. Maybe it's related to this?


I don't see how this should be related to your initial query where
?s was bound, which in my opinion should be an easy join. Is it
possible for you to share the dataset somehow? Also, what you can do
is compute statistics for the TDB database with the tdbstats tool [1]
from the command line and put the result into the TDB folder. But even
without statistics, the query plan should take the first triple pattern,
use the spo index since s and p are bound, then pass the bindings of ?o
to the evaluation of the second triple pattern.


[1] 
https://jena.apache.org/documentation/tdb/optimizer.html#generating-a-statistics-file






On 21/09/2022 9.15, Lorenz Buehmann wrote:
Weird, only 10M triples and each triple pattern returns only 1
binding, so the result size is tiny. Honestly, I can't think of
anything except open connections, but as you mentioned,
running the queries with only one triple pattern works as
expected, so too many open connections is most likely not the issue.


Can you reproduce this behavior with newer Jena versions like 4.6.1?

Or can you reproduce this on different servers as well?

Is it also stuck if you run the query directly after you restart
Fuseki?



On 19.09.22 13:49, Mikael Pesonen wrote:



On 15/09/2022 17.48, Lorenz Buehmann wrote:

Forgot:

- size of result for each triple pattern? Might affect if hash 
join can be used.

It's one row for each.


- your hardware?

Normal server with 16gigs mem.


- is it just the first query after starting Fuseki? Connections 
have been closed? Note, there was also a bug in a recent Jena 
version, but only with TDB and too many open connections. It 
has been resolved with release 4.6.1.

Jena has been running quite a while.


Might not be related, but I'm mentioning all things here 
nevertheless.



On 15.09.22 11:16, Mikael Pesonen wrote:


This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
  #<https://x.y.z> a ?t .
  :C1 skos:prefLabel ?l
}


But this always hangs until timeout

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  ?t skos:prefLabel ?l
}

What am I missing here? I'm using Fuseki web GUI. Thanks!













Re: Weird sparql problem

2022-11-01 Thread Mikael Pesonen
Different case, but again hanging makes no sense to the user, whatever
the technical reasons are.


 VALUES ?sct_code { "298314008" }
   ?c skosxl:prefLabel [ lsu:code ?sct_code ]

returns one row immediately, but

 VALUES ?sct_code { "298314008" }
   ?c skosxl:prefLabel [ lsu:code ?sct_code ]; skos:inScheme lsu:SNOMEDCT_US


hangs forever


 skos:inScheme lsu:SNOMEDCT_US;

On 18/10/2022 9.08, Lorenz Buehmann wrote:

Hi,

comments inline

On 17.10.22 14:35, Mikael Pesonen wrote:
This works as a separate query, but not in the middle, since ?s
gets new values instead of binding to the previous ?s.


{ select ?t where {
?s a ?t .
 } limit 10}
  ?t skos:prefLabel ?l



In the middle of what? Subqueries are evaluated first. If you
really want labels for classes, you should use DISTINCT in the
subquery so that the intermediate result is small. There shouldn't
be that many classes, but there are many instances with the same class,
so without DISTINCT the join is more expensive than necessary.





On 17/10/2022 14.56, Mikael Pesonen wrote:


?s a ?t .
  ?t skos:prefLabel ?l

returns 3 million triples. Maybe it's related to this?


I don't see how this should be related to your initial query where ?s
was bound, which in my opinion should be an easy join. Is it possible
for you to share the dataset somehow? Also, what you can do is
compute statistics for the TDB database with the tdbstats tool [1] from
the command line and put the result into the TDB folder. But even without
statistics, the query plan should take the first triple pattern, use the
spo index since s and p are bound, then pass the bindings of ?o to the
evaluation of the second triple pattern.


[1] 
https://jena.apache.org/documentation/tdb/optimizer.html#generating-a-statistics-file






On 21/09/2022 9.15, Lorenz Buehmann wrote:
Weird, only 10M triples and each triple pattern returns only 1
binding, so the result size is tiny. Honestly, I can't think of
anything except open connections, but as you mentioned, running
the queries with only one triple pattern works as expected, so
too many open connections is most likely not the issue.


Can you reproduce this behavior with newer Jena versions like 4.6.1?

Or can you reproduce this on different servers as well?

Is it also stuck if you run the query directly after you restart
Fuseki?



On 19.09.22 13:49, Mikael Pesonen wrote:



On 15/09/2022 17.48, Lorenz Buehmann wrote:

Forgot:

- size of result for each triple pattern? Might affect if hash 
join can be used.

It's one row for each.


- your hardware?

Normal server with 16gigs mem.


- is it just the first query after starting Fuseki? Connections 
have been closed? Note, there was also a bug in a recent Jena 
version, but only with TDB and too many open connections. It has 
been resolved with release 4.6.1.

Jena has been running quite a while.


Might not be related, but I'm mentioning all things here 
nevertheless.



On 15.09.22 11:16, Mikael Pesonen wrote:


This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
  #<https://x.y.z> a ?t .
  :C1 skos:prefLabel ?l
}


But this always hangs until timeout

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  ?t skos:prefLabel ?l
}

What am I missing here? I'm using Fuseki web GUI. Thanks!











Re: Jena won't start (404)

2022-10-21 Thread Mikael Pesonen



Actually there were some HTML files missing in the webapp folder. Copying
them from an older release solved the issue.


On 21/10/2022 16.34, Øyvind Gjesdal wrote:

I think this may be due to the new Vue GUI, with the old GUI no longer being
present somewhere after 4.3.2?

When I check the path for our 4.6.1 GUI I also get a 404 on the same path you
use. When I click around in the new UI the URI is http://localhost:3030/#/
for the dataset and http://localhost:3030/#/manage for admin actions.

Best regards,
Øyvind Gjesdal

On Fri, Oct 21, 2022 at 12:27 PM Mikael Pesonen 
wrote:


jena-fuseki-4.3.2 works in the same setup, 4.6.0 and 4.6.1 don't. Any idea?

On 17/10/2022 16.45, Mikael Pesonen wrote:

Also /opt/xx/jena is a link to /opt/xx/apache-jena-fuseki-4.6.1

On 17/10/2022 16.42, Mikael Pesonen wrote:

We have an installation where the Jena data and text folders are soft links
to another drive (not sure if related). When starting the server, this
occurs:
Oct 17 16:24:45  systemd[1]: fuseki.service: Consumed 7.547s CPU time.
Oct 17 16:24:45  systemd[1]: Started Apache Jena Fuseki.
Oct 17 16:24:46  java[2210680]: [2022-10-17 16:24:46] Server INFO  Apache Jena Fuseki 4.6.1
Oct 17 16:24:46  java[2210680]: [2022-10-17 16:24:46] ContextHandler WARN  BaseResource file:///opt/xx/jena/webapp/ is aliased to file:///opt/xx/apache-jena-fuseki-4.6.1/webapp/ in o.e.j.w.WebAppContext@b8e246c{org.apache.jena.fuseki.Servlet,/,file:///opt/xx/jena/webapp/,STOPPED}. May not be supported in future releases.

and fuseki web gui at
https:///fuseki/dataset.html?tab=query=/ds says
Error 404: Not Found

Any idea how to fix this?




Re: Jena won't start (404)

2022-10-21 Thread Mikael Pesonen


jena-fuseki-4.3.2 works in the same setup, 4.6.0 and 4.6.1 don't. Any idea?

On 17/10/2022 16.45, Mikael Pesonen wrote:


Also /opt/xx/jena is a link to /opt/xx/apache-jena-fuseki-4.6.1

On 17/10/2022 16.42, Mikael Pesonen wrote:


We have an installation where the Jena data and text folders are soft links
to another drive (not sure if related). When starting the server, this
occurs:
Oct 17 16:24:45  systemd[1]: fuseki.service: Consumed 7.547s CPU time.
Oct 17 16:24:45  systemd[1]: Started Apache Jena Fuseki.
Oct 17 16:24:46  java[2210680]: [2022-10-17 16:24:46] Server INFO  Apache Jena Fuseki 4.6.1
Oct 17 16:24:46  java[2210680]: [2022-10-17 16:24:46] ContextHandler WARN  BaseResource file:///opt/xx/jena/webapp/ is aliased to file:///opt/xx/apache-jena-fuseki-4.6.1/webapp/ in o.e.j.w.WebAppContext@b8e246c{org.apache.jena.fuseki.Servlet,/,file:///opt/xx/jena/webapp/,STOPPED}. May not be supported in future releases.


and fuseki web gui at 
https:///fuseki/dataset.html?tab=query=/ds says

Error 404: Not Found

Any idea how to fix this?




Re: SPARQL limit doesn't work

2022-10-20 Thread Mikael Pesonen



I had to reset all Jena data since the server ran out of memory during a drop
graph. Now, with clean data, paging works. I'll let you know if the problem
repeats.


On 20/10/2022 9.37, Lorenz Buehmann wrote:


On 19.10.22 13:44, Mikael Pesonen wrote:




On 19/10/2022 10.18, Lorenz Buehmann wrote:
Honestly - probably because of lack of knowledge - I don't see how 
that can happen with the text index. You have a single triple 
pattern that is querying the Lucene index for the given pattern and 
returns by default at most 10 000 documents.



text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en" )

translates to


( (prefLabel:"\"xx yy\"" OR altLabel:"\"xx yy\"") AND lang:en)
which indeed can return duplicate documents as for each triple a 
separate document is created and indexed.
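
If duplicate hits per subject are indeed the cause (an assumption, not something confirmed here), one way to get stable pages is to collapse the hits to one row per subject before slicing. The sketch below models that on plain Python data; it does not call Jena or Lucene, and the hit list, scores and function names are made up for the example.

```python
from typing import Iterable

Hit = tuple[str, float]  # (subject IRI, Lucene score)


def best_hit_per_subject(hits: Iterable[Hit]) -> list[Hit]:
    """Collapse per-triple hits to one row per subject, keeping the top score."""
    best: dict[str, float] = {}
    for subject, score in hits:
        if subject not in best or score > best[subject]:
            best[subject] = score
    # Highest score first; ties broken by subject so paging is stable.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


def page(rows: list[Hit], offset: int, limit: int) -> list[Hit]:
    """Apply OFFSET/LIMIT after de-duplication."""
    return rows[offset:offset + limit]


def demo() -> tuple[int, int, list[Hit]]:
    # Hypothetical data: :a matches through both prefLabel and altLabel.
    hits = [(":a", 2.0), (":a", 1.5), (":b", 1.8), (":c", 0.9)]
    naive = len({s for s, _ in hits[:3]})  # slice first, then count subjects
    rows = best_hit_per_subject(hits)
    return naive, len(rows), page(rows, 0, 2)


if __name__ == "__main__":
    print(demo())
```

Slicing the raw hits first yields fewer distinct subjects than the limit, which resembles the symptom reported in this thread; de-duplicating first gives full pages.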


I still don't get how a query with limit 1000 returning 560 then 
doesn't return 100 if using limit 100


Currently, I find your results quite counterintuitive, but I still
have a lot to learn about using RDF, SPARQL and Jena.



Can you share some data please to reproduce?
Unfortunately I can't share the data. Of course, when I have time, I could
create a similar dummy index.


What happens for a single property only? 


What does this mean?
you're querying two properties aka two fields in the Lucene query. 
What if you just use skos:prefLabel ?


Pagination should work the way you're doing it. The Lucene query is
internally executed once, then cached. For later requests the same
Lucene document hits should be reused.


On 19.10.22 08:21, Mikael Pesonen wrote:


Hi,

yes, the same SELECT run as the only query returns exactly the LIMIT number of rows.

On 18/10/2022 16.48, Lorenz Buehmann wrote:
did you get those results when running only this subquery? Afaik, 
the default limit of the Lucene text query is at most 10 000 
documents - and I don't think that the outer LIMIT would make it 
to the Lucene request



On 18.10.22 13:35, Mikael Pesonen wrote:


I have a bigger query that starts with inner select

 { SELECT ?s ?score WHERE {
    (?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\""
"lang:en" ) .

    } order by desc(?score) offset 0 limit 1000 }

There are about 1 results. limit 1000 returns ~560 and limit 
100 ~75 results. How do I page results correctly?









Re: SPARQL limit doesn't work

2022-10-19 Thread Mikael Pesonen





On 19/10/2022 10.18, Lorenz Buehmann wrote:
Honestly - probably because of lack of knowledge - I don't see how 
that can happen with the text index. You have a single triple pattern 
that is querying the Lucene index for the given pattern and returns by 
default at most 10 000 documents.



text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en" )

translates to


( (prefLabel:"\"xx yy\"" OR altLabel:"\"xx yy\"") AND lang:en)
which indeed can return duplicate documents as for each triple a 
separate document is created and indexed.


I still don't get how a query with limit 1000 returning 560 then 
doesn't return 100 if using limit 100


Currently, I find your results quite counterintuitive, but I still
have a lot to learn about using RDF, SPARQL and Jena.



Can you share some data please to reproduce?
Unfortunately I can't share the data. Of course, when I have time, I could
create a similar dummy index.


What happens for a single property only? 


What does this mean?

Pagination should work the way you're doing it. The Lucene query is
internally executed once, then cached. For later requests the same
Lucene document hits should be reused.


On 19.10.22 08:21, Mikael Pesonen wrote:


Hi,

yes, the same SELECT run as the only query returns exactly the LIMIT number of rows.

On 18/10/2022 16.48, Lorenz Buehmann wrote:
did you get those results when running only this subquery? Afaik, 
the default limit of the Lucene text query is at most 10 000 
documents - and I don't think that the outer LIMIT would make it to 
the Lucene request



On 18.10.22 13:35, Mikael Pesonen wrote:


I have a bigger query that starts with inner select

 { SELECT ?s ?score WHERE {
    (?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\""
"lang:en" ) .

    } order by desc(?score) offset 0 limit 1000 }

There are about 1 results. limit 1000 returns ~560 and limit 
100 ~75 results. How do I page results correctly?







Re: SPARQL limit doesn't work

2022-10-19 Thread Mikael Pesonen



Hi,

yes, the same SELECT run as the only query returns exactly the LIMIT number of rows.

On 18/10/2022 16.48, Lorenz Buehmann wrote:
did you get those results when running only this subquery? Afaik, the 
default limit of the Lucene text query is at most 10 000 documents - 
and I don't think that the outer LIMIT would make it to the Lucene 
request



On 18.10.22 13:35, Mikael Pesonen wrote:


I have a bigger query that starts with inner select

 { SELECT ?s ?score WHERE {
    (?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\"" 
"lang:en" ) .

    } order by desc(?score) offset 0 limit 1000 }

There are about 1 results. limit 1000 returns ~560 and limit 100 
~75 results. How do I page results correctly?





SPARQL limit doesn't work

2022-10-18 Thread Mikael Pesonen



I have a bigger query that starts with inner select

 { SELECT ?s ?score WHERE {
    (?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\"" 
"lang:en" ) .

    } order by desc(?score) offset 0 limit 1000 }

There are about 1 results. limit 1000 returns ~560 and limit 100 ~75 
results. How do I page results correctly?


Re: Jena won't start (404)

2022-10-17 Thread Mikael Pesonen


Also /opt/xx/jena is a link to /opt/xx/apache-jena-fuseki-4.6.1

On 17/10/2022 16.42, Mikael Pesonen wrote:


We have an installation where the Jena data and text folders are soft links
to another drive (not sure if related). When starting the server, this
occurs:
Oct 17 16:24:45  systemd[1]: fuseki.service: Consumed 7.547s CPU time.
Oct 17 16:24:45  systemd[1]: Started Apache Jena Fuseki.
Oct 17 16:24:46  java[2210680]: [2022-10-17 16:24:46] Server INFO  Apache Jena Fuseki 4.6.1
Oct 17 16:24:46  java[2210680]: [2022-10-17 16:24:46] ContextHandler WARN  BaseResource file:///opt/xx/jena/webapp/ is aliased to file:///opt/xx/apache-jena-fuseki-4.6.1/webapp/ in o.e.j.w.WebAppContext@b8e246c{org.apache.jena.fuseki.Servlet,/,file:///opt/xx/jena/webapp/,STOPPED}. May not be supported in future releases.


and fuseki web gui at 
https:///fuseki/dataset.html?tab=query=/ds says

Error 404: Not Found

Any idea how to fix this?




Jena won't start (404)

2022-10-17 Thread Mikael Pesonen


We have an installation where the Jena data and text folders are soft links to
another drive (not sure if related). When starting the server, this occurs:

Oct 17 16:24:45  systemd[1]: fuseki.service: Consumed 7.547s CPU time.
Oct 17 16:24:45  systemd[1]: Started Apache Jena Fuseki.
Oct 17 16:24:46  java[2210680]: [2022-10-17 16:24:46] Server INFO  Apache Jena Fuseki 4.6.1
Oct 17 16:24:46  java[2210680]: [2022-10-17 16:24:46] ContextHandler WARN  BaseResource file:///opt/xx/jena/webapp/ is aliased to file:///opt/xx/apache-jena-fuseki-4.6.1/webapp/ in o.e.j.w.WebAppContext@b8e246c{org.apache.jena.fuseki.Servlet,/,file:///opt/xx/jena/webapp/,STOPPED}. May not be supported in future releases.


and fuseki web gui at 
https:///fuseki/dataset.html?tab=query=/ds says


Error 404: Not Found

Any idea how to fix this?


Re: Weird sparql problem

2022-10-17 Thread Mikael Pesonen
This works as a separate query, but not in the middle, since ?s gets
new values instead of binding to the previous ?s.


{ select ?t where {
?s a ?t .
 } limit 10}
  ?t skos:prefLabel ?l

On 17/10/2022 14.56, Mikael Pesonen wrote:


?s a ?t .
  ?t skos:prefLabel ?l

returns 3 million triples. Maybe it's related to this?


On 21/09/2022 9.15, Lorenz Buehmann wrote:
Weird, only 10M triples and each triple pattern returns only 1
binding, so the result size is tiny. Honestly, I can't think of anything
except open connections, but as you mentioned, running the
queries with only one triple pattern works as expected, so too
many open connections is most likely not the issue.


Can you reproduce this behavior with newer Jena versions like 4.6.1?

Or can you reproduce this on different servers as well?

Is it also stuck if you run the query directly after you restart
Fuseki?



On 19.09.22 13:49, Mikael Pesonen wrote:



On 15/09/2022 17.48, Lorenz Buehmann wrote:

Forgot:

- size of result for each triple pattern? Might affect if hash join 
can be used.

It's one row for each.


- your hardware?

Normal server with 16gigs mem.


- is it just the first query after starting Fuseki? Connections 
have been closed? Note, there was also a bug in a recent Jena 
version, but only with TDB and too many open connections. It has 
been resolved with release 4.6.1.

Jena has been running quite a while.


Might not be related, but I'm mentioning all things here nevertheless.


On 15.09.22 11:16, Mikael Pesonen wrote:


This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
  #<https://x.y.z> a ?t .
  :C1 skos:prefLabel ?l
}


But this always hangs until timeout

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  ?t skos:prefLabel ?l
}

What am I missing here? I'm using Fuseki web GUI. Thanks!









Re: Weird sparql problem

2022-10-17 Thread Mikael Pesonen



?s a ?t .
  ?t skos:prefLabel ?l

returns 3 million triples. Maybe it's related to this?


On 21/09/2022 9.15, Lorenz Buehmann wrote:
Weird, only 10M triples and each triple pattern returns only 1
binding, so the result size is tiny. Honestly, I can't think of anything
except open connections, but as you mentioned, running the queries
with only one triple pattern works as expected, so too many open
connections is most likely not the issue.


Can you reproduce this behavior with newer Jena versions like 4.6.1?

Or can you reproduce this on different servers as well?

Is it also stuck if you run the query directly after you restart Fuseki?


On 19.09.22 13:49, Mikael Pesonen wrote:



On 15/09/2022 17.48, Lorenz Buehmann wrote:

Forgot:

- size of result for each triple pattern? Might affect if hash join 
can be used.

It's one row for each.


- your hardware?

Normal server with 16gigs mem.


- is it just the first query after starting Fuseki? Connections have 
been closed? Note, there was also a bug in a recent Jena version, 
but only with TDB and too many open connections. It has been 
resolved with release 4.6.1.

Jena has been running quite a while.


Might not be related, but I'm mentioning all things here nevertheless.


On 15.09.22 11:16, Mikael Pesonen wrote:


This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
  #<https://x.y.z> a ?t .
  :C1 skos:prefLabel ?l
}


But this always hangs until timeout

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  ?t skos:prefLabel ?l
}

What am I missing here? I'm using Fuseki web GUI. Thanks!







Re: Weird sparql problem

2022-09-21 Thread Mikael Pesonen



A fresh start of the server didn't help. I'll try a fresh 4.6.1 install
in a few days.


BR

On 21/09/2022 9.15, Lorenz Buehmann wrote:
Weird, only 10M triples and each triple pattern returns only 1
binding, so the result size is tiny. Honestly, I can't think of anything
except open connections, but as you mentioned, running the queries
with only one triple pattern works as expected, so too many open
connections is most likely not the issue.


Can you reproduce this behavior with newer Jena versions like 4.6.1?

Or can you reproduce this on different servers as well?

Is it also stuck if you run the query directly after you restart Fuseki?


On 19.09.22 13:49, Mikael Pesonen wrote:



On 15/09/2022 17.48, Lorenz Buehmann wrote:

Forgot:

- size of result for each triple pattern? Might affect if hash join 
can be used.

It's one row for each.


- your hardware?

Normal server with 16gigs mem.


- is it just the first query after starting Fuseki? Connections have 
been closed? Note, there was also a bug in a recent Jena version, 
but only with TDB and too many open connections. It has been 
resolved with release 4.6.1.

Jena has been running quite a while.


Might not be related, but I'm mentioning all things here nevertheless.


On 15.09.22 11:16, Mikael Pesonen wrote:


This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
  #<https://x.y.z> a ?t .
  :C1 skos:prefLabel ?l
}


But this always hangs until timeout

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  ?t skos:prefLabel ?l
}

What am I missing here? I'm using Fuseki web GUI. Thanks!







Re: Weird sparql problem

2022-09-19 Thread Mikael Pesonen




On 15/09/2022 17.48, Lorenz Buehmann wrote:

Forgot:

- size of result for each triple pattern? Might affect if hash join 
can be used.

It's one row for each.


- your hardware?

Normal server with 16gigs mem.


- is it just the first query after starting Fuseki? Connections have 
been closed? Note, there was also a bug in a recent Jena version, but 
only with TDB and too many open connections. It has been resolved with 
release 4.6.1.

Jena has been running quite a while.


Might not be related, but I'm mentioning all things here nevertheless.


On 15.09.22 11:16, Mikael Pesonen wrote:


This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
  #<https://x.y.z> a ?t .
  :C1 skos:prefLabel ?l
}


But this always hangs until timeout

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  ?t skos:prefLabel ?l
}

What am I missing here? I'm using Fuseki web GUI. Thanks!





Re: Weird sparql problem

2022-09-19 Thread Mikael Pesonen



Hi, thanks for looking into this.

On 15/09/2022 17.44, Lorenz Buehmann wrote:

Fuseki with in-memory backend or TDB?


TDB2.

Which version?

4.3.2


How large is the dataset? Not that I see how this simple query with a 
single join should lead to a timeout, but any numbers are usually 
helpful.

It's around 10M triples

Did you try the query without defining the default graph but using a 
graph pattern, i.e.


SELECT *
WHERE {
  GRAPH <https://a.b.c> {<https://x.y.z> a ?t .
  ?t skos:prefLabel ?l }
}

I have two graphs in the original query, each with its own FROM. When I moved 
one of them into a GRAPH clause, the query worked once but took a few minutes. 
Other attempts timed out.


And/or did you try to reorder the triple patterns? The query optimizer 
should prefer the first one though anyways as it can make use of spo 
index (if it would be TDB)

Swapping the order of the triple patterns doesn't help.


On 15.09.22 11:16, Mikael Pesonen wrote:


This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
  #<https://x.y.z> a ?t .
  :C1 skos:prefLabel ?l
}


But this always hangs until timeout

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  ?t skos:prefLabel ?l
}

What am I missing here? I'm using Fuseki web GUI. Thanks!





Weird sparql problem

2022-09-15 Thread Mikael Pesonen



This returns one row fast, say :C1

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  #?t skos:prefLabel ?l
}


and this too:

SELECT *
FROM <https://a.b.c>
WHERE {
  #<https://x.y.z> a ?t .
  :C1 skos:prefLabel ?l
}


But this always hangs until timeout

SELECT *
FROM <https://a.b.c>
WHERE {
  <https://x.y.z> a ?t .
  ?t skos:prefLabel ?l
}

What am I missing here? I'm using Fuseki web GUI. Thanks!


Re: Store data in text/owl-functional format

2022-04-05 Thread Mikael Pesonen



At this time I need to make one manual conversion only...

On 05/04/2022 13.29, Steve Vestal wrote:
I would be interested to hear what you find out about using owlapi 
with Jena.  If there is a relatively straightforward way to integrate 
it to broaden the formats supported with Jena, that would be a useful 
contribution to the community.  The brass ring would be to also enable 
use of reasoners via the owlapi reasoner interface.


On 4/5/2022 4:52 AM, Mikael Pesonen wrote:


Owlapi will work, thanks!

On 04/04/2022 18.17, Steve Vestal wrote:
I have done this manually using Protege (convert ofn to RDF/XML). 
You could also look at https://github.com/owlcs/owlapi/


On 4/4/2022 8:53 AM, Mikael Pesonen wrote:


Is it possible to send OWL in functional syntax to Jena? Fuseki says

Unknown content type for triples: [text/owl-functional]

when posting with curl.








Re: Store data in text/owl-functional format

2022-04-05 Thread Mikael Pesonen



Owlapi will work, thanks!

On 04/04/2022 18.17, Steve Vestal wrote:
I have done this manually using Protege (convert ofn to RDF/XML). You 
could also look at https://github.com/owlcs/owlapi/


On 4/4/2022 8:53 AM, Mikael Pesonen wrote:


Is it possible to send OWL in functional syntax to Jena? Fuseki says

Unknown content type for triples: [text/owl-functional]

when posting with curl.






Store data in text/owl-functional format

2022-04-04 Thread Mikael Pesonen



Is it possible to send OWL in functional syntax to Jena? Fuseki says

Unknown content type for triples: [text/owl-functional]

when posting with curl.



Re: Safe delete & insert with Fuseki

2022-03-31 Thread Mikael Pesonen



The only thing I can think of is to try the insert first on some temporary 
graph to see if it succeeds. That may be unnecessarily heavy, though.


On 26/03/2022 11.12, Andy Seaborne wrote:


If you put all the changes in one update request, they will be done 
atomically.


DELETE { ... } WHERE { ... }
;
INSERT DATA { ... }

Also, the WHERE clause in a DELETE-INSERT-WHERE can be used to "switch 
off" an operation.


    Andy
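
As a rough illustration of the advice above, the sketch below joins several update operations with `;` into one SPARQL Update request and prepares an HTTP POST for it. The endpoint URL `http://localhost:3030/ds/update`, the graph name and the `example.org` IRIs are made-up placeholders, not taken from this thread, and the helper names are hypothetical.

```python
import urllib.request

# Hypothetical endpoint; adjust host, port and dataset name to your setup.
UPDATE_URL = "http://localhost:3030/ds/update"


def combine_updates(operations):
    """Join individual update operations into ONE request body.

    Operations in a single SPARQL Update request are separated by ';'.
    Per the advice above, such a request is applied atomically.
    """
    cleaned = [op.strip().rstrip(";").strip() for op in operations if op.strip()]
    return " ;\n".join(cleaned)


def build_update_request(update_text, url=UPDATE_URL):
    """Prepare (but do not send) a SPARQL 1.1 Protocol update request."""
    return urllib.request.Request(
        url,
        data=update_text.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


# Illustrative operations; graph and predicate names are placeholders.
delete_old = """
DELETE { GRAPH <http://example.org/g> { ?s <http://example.org/label> ?old } }
WHERE  { GRAPH <http://example.org/g> { ?s <http://example.org/label> ?old } }
"""
insert_new = """
INSERT DATA { GRAPH <http://example.org/g> {
    <http://example.org/item1> <http://example.org/label> "new label" } }
"""

request = build_update_request(combine_updates([delete_old, insert_new]))
# To execute against a running server:
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```

Because both operations travel in one request, a failure in the insert should leave the old data in place rather than half-replaced.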

On 24/03/2022 13:17, Mikael Pesonen wrote:


We have occasionally an issue with replace where inserting new data 
may fail after old data has been deleted. What is the recommended way 
to do this kind of multipart SPARQL with Fuseki (not java)? First 
option comes to mind is to test if SPARQL insert is correct and WOULD 
be executed. Is this possible?







Re: Safe delete & insert with Fuseki

2022-03-28 Thread Mikael Pesonen
The queries don't have enough selection criteria in common, so they have to be
executed separately. Otherwise that would be a nice solution.

On Sat, 26 Mar 2022 at 11:12, Andy Seaborne  wrote:

>
> If you put all the changes in one update request, they will be done
> atomically.
>
> DELETE { ... } WHERE { ... }
> ;
> INSERT DATA { ... }
>
> Also, the WHERE clause in a DELETE-INSERT-WHERE can be used to "switch
> off" an operation.
>
>  Andy
>
> On 24/03/2022 13:17, Mikael Pesonen wrote:
> >
> > We have occasionally an issue with replace where inserting new data may
> > fail after old data has been deleted. What is the recommended way to do
> > this kind of multipart SPARQL with Fuseki (not java)? First option comes
> > to mind is to test if SPARQL insert is correct and WOULD be executed. Is
> > this possible?
> >
>


Safe delete & insert with Fuseki

2022-03-24 Thread Mikael Pesonen



We occasionally have an issue with replace operations, where inserting the new 
data fails after the old data has already been deleted. What is the recommended 
way to do this kind of multipart SPARQL update with Fuseki (not Java)? The first 
option that comes to mind is to test whether the SPARQL insert is correct and 
WOULD be executed. Is this possible?




Re: NodeTableTRDF/Read exception

2022-03-22 Thread Mikael Pesonen



We don't have that going on, at least.

On 21/03/2022 20.43, Andy Seaborne wrote:
The only time I have seen anything similar to this is on Android, where 
some other process is messing about with the files. TDB is not proof 
against other processes accessing the same files, including on shared 
network drives where different computers access the same filesystem.


    Andy

On 21/03/2022 11:39, Mikael Pesonen wrote:


Got this again after a few days of light usage, after TDB2 was rebuilt 
from empty. Would you suggest this is a hardware error? Is there no 
possibility that it's a Jena error?


On 28/05/2021 17.25, Andy Seaborne wrote:



On 28/05/2021 14:59, Mikael Pesonen wrote:


I should try some older Jena/Fuseki version?


Yes.

Also
 - run on different hardware.
 - run multiple times
 - look at the data and see if anything unusual is in it.
 etc etc




On 28/05/2021 16.49, Andy Seaborne wrote:



On 28/05/2021 14:03, Mikael Pesonen wrote:


No this is the fresh db, started from empty today. And plenty of 
disk space.


So it's repeatable.

With no Minimal, Verifiable, Complete Example, it'll have to be an 
on-site investigation. Try different versions.


    Andy



On 28/05/2021 15.58, Andy Seaborne wrote:

Why are you adding data to a broken database?

On 28/05/2021 12:02, Mikael Pesonen wrote:


Actually now it happened again. Same size, about 80MB of 
turtle, imported without warnings this time, but reading the 
graph fails with this exception.


13:59:39 WARN  Fuseki  :: [44] RC = 500 : NodeTableTRDF/Read

org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:113) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:74) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterBlockTriplesStar.hasNextBinding(QueryIterBlockTriplesStar.java:54) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki-server.jar:3.17.0

Re: NodeTableTRDF/Read exception

2022-03-21 Thread Mikael Pesonen



Got this again after a few days of light usage, after TDB2 was rebuilt 
from empty. Would you suggest this is a hardware error? Is there no 
possibility that it's a Jena error?


On 28/05/2021 17.25, Andy Seaborne wrote:



On 28/05/2021 14:59, Mikael Pesonen wrote:


I should try some older Jena/Fuseki version?


Yes.

Also
 - run on different hardware.
 - run multiple times
 - look at the data and see if anything unusual is in it.
 etc etc




On 28/05/2021 16.49, Andy Seaborne wrote:



On 28/05/2021 14:03, Mikael Pesonen wrote:


No this is the fresh db, started from empty today. And plenty of 
disk space.


So it's repeatable.

With no Minimal, Verifiable, Complete Example, it'll have to be an 
on-site investigation. Try different versions.


    Andy



On 28/05/2021 15.58, Andy Seaborne wrote:

Why are you adding data to a broken database?

On 28/05/2021 12:02, Mikael Pesonen wrote:


Actually now it happened again. Same size, about 80MB of turtle, 
imported without warnings this time, but reading the graph fails 
with this exception.


13:59:39 WARN  Fuseki  :: [44] RC = 500 : NodeTableTRDF/Read
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:113) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:74) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterBlockTriplesStar.hasNextBinding(QueryIterBlockTriplesStar.java:54) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki-server.jar:3.17.0

Re: SPARQL optional limiting results

2022-03-18 Thread Mikael Pesonen



Okay thanks. This stuff is complicated...

On 18/03/2022 16.20, Andy Seaborne wrote:

The OPTIONAL uses ?graph.

So it isn't simply a matter of "optional adds rows". After the optional 
happens, there is a new condition on ?graph.


The inside of GRAPH ?graph { inner } is executed then joined to ensure 
?graph is the right value.
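
The effect described above can be reproduced with a toy model of the SPARQL algebra. The sketch below is not Jena's implementation; it only mimics the join and left-join rules on lists of Python dicts. The IRIs (`graph:tero`, `scheme:tero` and so on) are invented, under the assumption that in one graph the `skos:topConceptOf` value differs from the graph name.

```python
def compatible(a, b):
    """Two solutions are compatible when they agree on all shared variables."""
    return all(b[var] == val for var, val in a.items() if var in b)


def join(left, right):
    """Keep every compatible pair of solutions, merged into one row."""
    return [{**l, **r} for l in left for r in right if compatible(l, r)]


def left_join(left, right):
    """OPTIONAL: extend a left row when possible, otherwise keep it unchanged."""
    result = []
    for l in left:
        extended = [{**l, **r} for r in right if compatible(l, r)]
        result.extend(extended or [l])
    return result


def eval_graph(graph_name, inner, optional=None):
    """GRAPH ?graph { inner OPTIONAL { optional } } for one named graph.

    The body is evaluated first; only afterwards is it joined with
    ?graph = <name of the graph>.
    """
    rows = inner if optional is None else left_join(inner, optional)
    return join(rows, [{"graph": graph_name}])


# Invented data: one text-search hit per graph.
tero_hits = [{"concept": "tero:c1"}]
mesh_hits = [{"concept": "mesh:c1"}]
# OPTIONAL part: skos:topConceptOf binds ?graph to the concept scheme IRI.
tero_opt = [{"concept": "tero:c1", "graph": "scheme:tero"}]  # differs from graph name
mesh_opt = [{"concept": "mesh:c1", "graph": "graph:mesh"}]   # same as graph name

without_optional = (eval_graph("graph:tero", tero_hits)
                    + eval_graph("graph:mesh", mesh_hits))
with_optional = (eval_graph("graph:tero", tero_hits, tero_opt)
                 + eval_graph("graph:mesh", mesh_hits, mesh_opt))

print(len(without_optional))  # 2
print(len(with_optional))     # 1
```

In this model the tero row disappears because the OPTIONAL has already bound ?graph to a value that the final join rejects. Using a different variable name inside the OPTIONAL (say `?scheme`) would avoid the clash, at the cost of no longer tying the top concept to the graph name.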


On 18/03/2022 12:52, Mikael Pesonen wrote:


Hi Martynas,

So the query below returns some extra columns (but fewer rows) with 
OPTIONAL. With OPTIONAL, only one item is returned, from graph 
<http://www.yso.fi/onto/mesh/>. Without OPTIONAL, two items are 
returned, one from each graph.


Data is available on bottom of these pages:
http://finto.fi/mesh/en/
http://finto.fi/tero/en/


2 files, 600K and 800K lines.

Not a minimal example. Not a data sample.

And it's not the data being used here, which is a dataset with named 
graphs (different names to their URLs), not Turtle files.





PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX text: <http://jena.apache.org/text#>
SELECT *
WHERE
{
 VALUES ?graph {<http://www.yso.fi/onto/tero/> 
<http://www.yso.fi/onto/mesh/>}

 GRAPH ?graph
 {
 {
 SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
 {
 {
 (?concept ?score1 ?prefLabelm) text:query 
(skos:prefLabel "aamiainen*") .

 FILTER ( (lang(?prefLabelm) = "fi" ))
 }
 }
 }
 OPTIONAL { ?concept skos:broader* [ skos:topConceptOf 
?graph; skos:prefLabel ?topConceptLabel ] }

 }
}

On 18/03/2022 12.58, Martynas Jusevičius wrote:

Can you provide a full query string and a data sample that illustrate
the problem? Then it's easy to see what's going on, for example on
http://sparql.org/sparql.html.

On Fri, Mar 18, 2022 at 11:52 AM Mikael Pesonen
  wrote:


Is this a problem with query, not with Jena?

On 15/03/2022 9.30, Lorenz Buehmann wrote:

Hi,

I'm probably misunderstanding the query, but what is the purpose of
the OPTIONAL here?

?graph is bound because of VALUES clause, ?concept is bound 
because of

the graph pattern before the OPTIONAL as well.

So ?graph and ?concept are bound on the left hand side of the
left-join aka OPTIONAL

Here is the algebra:

(join
  (table (vars ?graph)
    (row [?graph <http://www.yso.fi/onto/tero/>])
    (row [?graph <http://www.yso.fi/onto/mesh/>])
  )
  (assign ((?graph ?*g0))
    (leftjoin
      (distinct
        (project (?concept ?prefLabelm ?altLabelm)
          (filter (= (lang ?prefLabelm) "fi")
            (quadpattern
              (quad ?*g0 ??0 rdf:first ?concept)
              (quad ?*g0 ??0 rdf:rest ??1)
              (quad ?*g0 ??1 rdf:first ?score1)
              (quad ?*g0 ??1 rdf:rest ??2)
              (quad ?*g0 ??2 rdf:first ?prefLabelm)
              (quad ?*g0 ??2 rdf:rest rdf:nil)
              (quad ?*g0 ??0 text:query ??3)
              (quad ?*g0 ??3 rdf:first skos:prefLabel)
              (quad ?*g0 ??3 rdf:rest ??4)
              (quad ?*g0 ??4 rdf:first "aamiainen*")
              (quad ?*g0 ??4 rdf:rest rdf:nil)
            ))))
      (sequence
        (graph ?*g0
          (path ?concept (path* skos:broader) ??5))
        (quadpattern (quad ?*g0 ??5 skos:topConceptOf ?graph))))))



Can you say what you want to achieve with the OPTIONAL maybe, it 
won't

return any additional data as far as I can see.

On 14.03.22 14:30, Mikael Pesonen wrote:

Hi, not directly related to Jena, but I have a query in which
optional clause limits the number of results. I thought it's never
possible. So below query returns less results with optional enabled.
Wonder why is that and what would be the correct way to get optional
data so than all rows are returned?

SELECT *
WHERE
{
 VALUES ?graph {<http://www.yso.fi/onto/tero/>
<http://www.yso.fi/onto/mesh/>}
 GRAPH ?graph
 {
 {
 SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
 {
 {
 (?concept ?score1 ?prefLabelm) text:query
(skos:prefLabel "aamiainen*") .
 FILTER ( (lang(?prefLabelm) = "fi" ))
 }
 }
 }
    # OPTIONAL { ?concept skos:broader* [ skos:topConceptOf 
?graph] }

 }
}







Re: SPARQL optional limiting results

2022-03-18 Thread Mikael Pesonen


Hi Martynas,

So the query below returns some extra columns (but fewer rows) with 
OPTIONAL. With OPTIONAL, only one item is returned, from graph 
<http://www.yso.fi/onto/mesh/>. Without OPTIONAL, two items are 
returned, one from each graph.


Data is available on bottom of these pages:
http://finto.fi/mesh/en/
http://finto.fi/tero/en/


PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX text: <http://jena.apache.org/text#>
SELECT *
WHERE
{
    VALUES ?graph { <http://www.yso.fi/onto/tero/> <http://www.yso.fi/onto/mesh/> }

    GRAPH ?graph
    {
        {
            SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
            {
                {
                    (?concept ?score1 ?prefLabelm) text:query (skos:prefLabel "aamiainen*") .
                    FILTER ( (lang(?prefLabelm) = "fi" ))
                }
            }
        }
        OPTIONAL { ?concept skos:broader* [ skos:topConceptOf ?graph; skos:prefLabel ?topConceptLabel ] }
    }
}

On 18/03/2022 12.58, Martynas Jusevičius wrote:

Can you provide a full query string and a data sample that illustrate
the problem? Then it's easy to see what's going on, for example on
http://sparql.org/sparql.html.

On Fri, Mar 18, 2022 at 11:52 AM Mikael Pesonen
  wrote:


Is this a problem with query, not with Jena?

On 15/03/2022 9.30, Lorenz Buehmann wrote:

Hi,

I'm probably misunderstanding the query, but what is the purpose of
the OPTIONAL here?

?graph is bound because of VALUES clause, ?concept is bound because of
the graph pattern before the OPTIONAL as well.

So ?graph and ?concept are bound on the left hand side of the
left-join aka OPTIONAL

Here is the algebra:

(join
   (table (vars ?graph)
 (row [?graph<http://www.yso.fi/onto/tero/>])
 (row [?graph<http://www.yso.fi/onto/mesh/>])
   )
   (assign ((?graph ?*g0))
 (leftjoin
   (distinct
 (project (?concept ?prefLabelm ?altLabelm)
   (filter (= (lang ?prefLabelm) "fi")
 (quadpattern
   (quad ?*g0 ??0 rdf:first ?concept)
   (quad ?*g0 ??0 rdf:rest ??1)
   (quad ?*g0 ??1 rdf:first ?score1)
   (quad ?*g0 ??1 rdf:rest ??2)
   (quad ?*g0 ??2 rdf:first ?prefLabelm)
   (quad ?*g0 ??2 rdf:rest rdf:nil)
   (quad ?*g0 ??0 text:query ??3)
   (quad ?*g0 ??3 rdf:first skos:prefLabel)
   (quad ?*g0 ??3 rdf:rest ??4)
   (quad ?*g0 ??4 rdf:first "aamiainen*")
   (quad ?*g0 ??4 rdf:rest rdf:nil)
 
   (sequence
 (graph ?*g0
   (path ?concept (path* skos:broader) ??5))
 (quadpattern (quad ?*g0 ??5 skos:topConceptOf ?graph)


Can you say what you want to achieve with the OPTIONAL maybe, it won't
return any additional data as far as I can see.

On 14.03.22 14:30, Mikael Pesonen wrote:

Hi, not directly related to Jena, but I have a query in which
optional clause limits the number of results. I thought it's never
possible. So below query returns less results with optional enabled.
Wonder why is that and what would be the correct way to get optional
data so than all rows are returned?

SELECT *
WHERE
{
 VALUES ?graph {<http://www.yso.fi/onto/tero/>
<http://www.yso.fi/onto/mesh/>}
 GRAPH ?graph
 {
 {
 SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
 {
 {
 (?concept ?score1 ?prefLabelm) text:query
(skos:prefLabel "aamiainen*") .
 FILTER ( (lang(?prefLabelm) = "fi" ))
 }
 }
 }
# OPTIONAL { ?concept skos:broader* [ skos:topConceptOf ?graph] }
 }
}






Re: SPARQL optional limiting results

2022-03-18 Thread Mikael Pesonen



Is this a problem with query, not with Jena?

On 15/03/2022 9.30, Lorenz Buehmann wrote:

Hi,

I'm probably misunderstanding the query, but what is the purpose of 
the OPTIONAL here?


?graph is bound because of VALUES clause, ?concept is bound because of 
the graph pattern before the OPTIONAL as well.


So ?graph and ?concept are bound on the left hand side of the 
left-join aka OPTIONAL


Here is the algebra:

(join
  (table (vars ?graph)
    (row [?graph<http://www.yso.fi/onto/tero/>])
    (row [?graph<http://www.yso.fi/onto/mesh/>])
  )
  (assign ((?graph ?*g0))
    (leftjoin
  (distinct
    (project (?concept ?prefLabelm ?altLabelm)
  (filter (= (lang ?prefLabelm) "fi")
    (quadpattern
  (quad ?*g0 ??0 rdf:first ?concept)
  (quad ?*g0 ??0 rdf:rest ??1)
  (quad ?*g0 ??1 rdf:first ?score1)
  (quad ?*g0 ??1 rdf:rest ??2)
  (quad ?*g0 ??2 rdf:first ?prefLabelm)
  (quad ?*g0 ??2 rdf:rest rdf:nil)
  (quad ?*g0 ??0 text:query ??3)
  (quad ?*g0 ??3 rdf:first skos:prefLabel)
  (quad ?*g0 ??3 rdf:rest ??4)
  (quad ?*g0 ??4 rdf:first "aamiainen*")
  (quad ?*g0 ??4 rdf:rest rdf:nil)
    
  (sequence
    (graph ?*g0
  (path ?concept (path* skos:broader) ??5))
    (quadpattern (quad ?*g0 ??5 skos:topConceptOf ?graph)


Can you say what you want to achieve with the OPTIONAL maybe, it won't 
return any additional data as far as I can see.


On 14.03.22 14:30, Mikael Pesonen wrote:
Hi, not directly related to Jena, but I have a query in which 
optional clause limits the number of results. I thought it's never 
possible. So below query returns less results with optional enabled. 
Wonder why is that and what would be the correct way to get optional 
data so than all rows are returned?


SELECT *
WHERE
{
    VALUES ?graph {<http://www.yso.fi/onto/tero/> 
<http://www.yso.fi/onto/mesh/>}

    GRAPH ?graph
    {
    {
        SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
        {
            {
                (?concept ?score1 ?prefLabelm) text:query 
(skos:prefLabel "aamiainen*") .

    FILTER ( (lang(?prefLabelm) = "fi" ))
            }
    }
    }
   # OPTIONAL { ?concept skos:broader* [ skos:topConceptOf ?graph] }
    }
}





Re: SPARQL optional limiting results

2022-03-15 Thread Mikael Pesonen



Hi,

sorry I cleaned up the example a bit too much. So OPTIONAL is collecting 
additional data like this:


 OPTIONAL { ?concept skos:broader* [ skos:topConceptOf ?graph; 
skos:prefLabel ?topConceptLabel ] }


But even with original example, OPTIONAL shouldn't return fewer rows?


On 15/03/2022 9.30, Lorenz Buehmann wrote:

Hi,

I'm probably misunderstanding the query, but what is the purpose of 
the OPTIONAL here?


?graph is bound because of VALUES clause, ?concept is bound because of 
the graph pattern before the OPTIONAL as well.


So ?graph and ?concept are bound on the left hand side of the 
left-join aka OPTIONAL


Here is the algebra:

(join
  (table (vars ?graph)
    (row [?graph<http://www.yso.fi/onto/tero/>])
    (row [?graph<http://www.yso.fi/onto/mesh/>])
  )
  (assign ((?graph ?*g0))
    (leftjoin
  (distinct
    (project (?concept ?prefLabelm ?altLabelm)
  (filter (= (lang ?prefLabelm) "fi")
    (quadpattern
  (quad ?*g0 ??0 rdf:first ?concept)
  (quad ?*g0 ??0 rdf:rest ??1)
  (quad ?*g0 ??1 rdf:first ?score1)
  (quad ?*g0 ??1 rdf:rest ??2)
  (quad ?*g0 ??2 rdf:first ?prefLabelm)
  (quad ?*g0 ??2 rdf:rest rdf:nil)
  (quad ?*g0 ??0 text:query ??3)
  (quad ?*g0 ??3 rdf:first skos:prefLabel)
  (quad ?*g0 ??3 rdf:rest ??4)
  (quad ?*g0 ??4 rdf:first "aamiainen*")
  (quad ?*g0 ??4 rdf:rest rdf:nil)
    
  (sequence
    (graph ?*g0
  (path ?concept (path* skos:broader) ??5))
    (quadpattern (quad ?*g0 ??5 skos:topConceptOf ?graph)


Can you say what you want to achieve with the OPTIONAL maybe, it won't 
return any additional data as far as I can see.


On 14.03.22 14:30, Mikael Pesonen wrote:
Hi, not directly related to Jena, but I have a query in which 
optional clause limits the number of results. I thought it's never 
possible. So below query returns less results with optional enabled. 
Wonder why is that and what would be the correct way to get optional 
data so than all rows are returned?


SELECT *
WHERE
{
    VALUES ?graph {<http://www.yso.fi/onto/tero/> 
<http://www.yso.fi/onto/mesh/>}

    GRAPH ?graph
    {
    {
        SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
        {
            {
                (?concept ?score1 ?prefLabelm) text:query 
(skos:prefLabel "aamiainen*") .

    FILTER ( (lang(?prefLabelm) = "fi" ))
            }
    }
    }
   # OPTIONAL { ?concept skos:broader* [ skos:topConceptOf ?graph] }
    }
}





SPARQL optional limiting results

2022-03-14 Thread Mikael Pesonen
Hi, not directly related to Jena, but I have a query in which an OPTIONAL 
clause reduces the number of results. I thought that was never possible. The 
query below returns fewer results with the OPTIONAL enabled. I wonder why 
that is, and what would be the correct way to get the optional data so that 
all rows are returned?


SELECT *
WHERE
{
    VALUES ?graph { <http://www.yso.fi/onto/tero/> <http://www.yso.fi/onto/mesh/> }

    GRAPH ?graph
    {
    {
        SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
        {
            {
                (?concept ?score1 ?prefLabelm) text:query 
(skos:prefLabel "aamiainen*") .

    FILTER ( (lang(?prefLabelm) = "fi" ))
            }
    }
    }
   # OPTIONAL { ?concept skos:broader* [ skos:topConceptOf ?graph] }
    }
}


Re: Does Jena need maintainance regarding to disk space?

2022-03-01 Thread Mikael Pesonen



Would be useful if task id and Data- folder names were linked 
somehow. For example display the folder name in task info. Now it's a 
bit complicated to automate the process and make sure you delete only 
outdated folders.
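
As a rough illustration of the automation problem described here, the following Python sketch lists the Data-NNNN directories under a TDB2 database location and returns every one except the highest-numbered. It assumes the highest-numbered directory is the live one, which only holds after a compaction that finished successfully; a failed run can leave a partial, higher-numbered directory behind. The function name and the demo layout are illustrative, and nothing is deleted.

```python
import re
import tempfile
from pathlib import Path

# TDB2 keeps each storage generation in a directory named Data-NNNN.
DATA_DIR = re.compile(r"^Data-(\d+)$")


def outdated_generations(db_root):
    """Return all Data-NNNN directories under db_root except the highest-numbered."""
    numbered = []
    for child in Path(db_root).iterdir():
        match = DATA_DIR.match(child.name)
        if child.is_dir() and match:
            numbered.append((int(match.group(1)), child))
    numbered.sort()
    # Assumption: the last entry is the live generation, so it is kept.
    return [path for _, path in numbered[:-1]]


if __name__ == "__main__":
    # Demo on a throwaway directory; a real run would point at the database location.
    with tempfile.TemporaryDirectory() as root:
        for name in ("Data-0001", "Data-0002", "Data-0003"):
            (Path(root) / name).mkdir()
        print([p.name for p in outdated_generations(root)])
```

Returning the candidates instead of removing them leaves room for a manual check, for example confirming in the server log that the compact task finished without an exception.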


On 14/02/2022 21.30, Andy Seaborne wrote:

Yes, it's a good idea.

TDB2 in Fuseki has the "compact" operation to do this without stopping 
the server. It creates a new "Data-/" directory and you can delete 
lower numbered databases.


TDB1 needs the server to be stopped for a rebuild. If you can stop 
updates: stop updates (the server is now read-only), back up, rebuild from 
the backup, then stop the server and swap the databases.


    Andy

On 14/02/2022 11:39, Mikael Pesonen wrote:


Hi,

we have now 13M triples and space usage of Jena data folder is 88G 
which seems high. This is not including text index.
Should we cleanup/compress/rebuild etc the database regularly in 
order to keep disk usage lower, or is this normal disk usage?


BR






Re: Does Jena need maintainance regarding to disk space?

2022-02-18 Thread Mikael Pesonen



Yes, I think we also ran out of disk space last time, and that might have 
corrupted something.


How does one dump the database without fuseki?

Thanks!
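
One possible answer, sketched under assumptions: the Apache Jena command-line distribution includes a `tdb2.tdbdump` script that writes a TDB2 database as N-Quads, and a `tdb2.tdbcompact` script for offline compaction. The Python wrapper below only builds the command lines; the database path is hypothetical, the scripts are assumed to be on the PATH, and Fuseki has to be stopped first so that only one process has the database open.

```python
import subprocess
from pathlib import Path


def dump_command(db_dir):
    """Command line that dumps the TDB2 database at db_dir to stdout as N-Quads."""
    return ["tdb2.tdbdump", "--loc", str(db_dir)]


def compact_command(db_dir):
    """Command line that compacts the TDB2 database at db_dir offline."""
    return ["tdb2.tdbcompact", "--loc", str(db_dir)]


def dump_to_file(db_dir, out_file):
    """Run the dump and write its output to out_file; raises if the tool fails."""
    with open(out_file, "wb") as out:
        subprocess.run(dump_command(db_dir), stdout=out, check=True)


if __name__ == "__main__":
    db = Path("/path/to/databases/ds")  # hypothetical location
    print(" ".join(dump_command(db)))
    print(" ".join(compact_command(db)))
```

If the dump fails with the same NodeTableTRDF/Read exception, the damage is in the stored data itself and compaction is unlikely to get past it either.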

On 17/02/2022 23.07, Andy Seaborne wrote:

Mikael,

You've had this problem before.

To revert your database:
1/ Stop the Fuseki server
2/ Delete Data-0002
3/ You can now restart

Now the underlying problem:

Do you have a backup?

Try to dump the database without Fuseki running.
If that works (I doubt it), run compact outside Fuseki.


The suggestion previously was :
  - run on different hardware.
  - run multiple times
  - look at the data and see if anything unusual is in it.

    Andy

On 17/02/2022 13:46, Mikael Pesonen wrote:


I get this exception with compact after couple of hours:

15:38:41 WARN  Compact :: [10]  Exception in compact
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:102) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:110) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:103) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:52) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.atlas.iterator.Iter$IterMap.next(Iter.java:417) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:41) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) ~[fuseki-server.jar:4.3.2]
    at java.util.Iterator.forEachRemaining(Iterator.java:133) ~[?:?]
    at org.apache.jena.tdb2.sys.CopyDSG.lambda$copy$0(CopyDSG.java:38) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.system.Txn.exec(Txn.java:77) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.system.Txn.executeWrite(Txn.java:125) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.sys.CopyDSG.lambda$copy$1(CopyDSG.java:36) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.system.Txn.exec(Txn.java:77) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.system.Txn.executeRead(Txn.java:115) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.sys.CopyDSG.copy(CopyDSG.java:35) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.sys.DatabaseOps.compact(DatabaseOps.java:261) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.sys.DatabaseOps.compact(DatabaseOps.java:210) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.DatabaseMgr.compact(DatabaseMgr.java:80) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.fuseki.ctl.ActionCompact$CompactTask.run(ActionCompact.java:109) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.fuseki.async.AsyncPool.lambda$submit$0(AsyncPool.java:66) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.fuseki.async.AsyncTask.call(AsyncTask.java:100) [fuseki-server.jar:4.3.2]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized type 0
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144) ~[fuseki-server.jar:4.3.2]
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:433) ~[fuseki-server.jar:4.3.2]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224) ~[fuseki-server.jar:4.3.2]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213) ~[fuseki-server.jar:4.3.2]
    at org.apache.thrift.TUnion.read(TUnion.java:138) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2

Re: Does Jena need maintainance regarding to disk space?

2022-02-17 Thread Mikael Pesonen
(NodeTableCache.java:206) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:110) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:103) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:52) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.atlas.iterator.Iter$IterMap.next(Iter.java:417) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:41) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) ~[fuseki-server.jar:4.3.2]
    at java.util.Iterator.forEachRemaining(Iterator.java:133) ~[?:?]
    at org.apache.jena.tdb2.sys.CopyDSG.lambda$copy$0(CopyDSG.java:38) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.system.Txn.exec(Txn.java:77) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.system.Txn.executeWrite(Txn.java:125) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.sys.CopyDSG.lambda$copy$1(CopyDSG.java:36) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.system.Txn.exec(Txn.java:77) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.system.Txn.executeRead(Txn.java:115) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.sys.CopyDSG.copy(CopyDSG.java:35) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.sys.DatabaseOps.compact(DatabaseOps.java:261) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.sys.DatabaseOps.compact(DatabaseOps.java:210) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.DatabaseMgr.compact(DatabaseMgr.java:80) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.fuseki.ctl.ActionCompact$CompactTask.run(ActionCompact.java:109) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.fuseki.async.AsyncPool.lambda$submit$0(AsyncPool.java:66) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.fuseki.async.AsyncTask.call(AsyncTask.java:100) [fuseki-server.jar:4.3.2]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized type 0
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144) ~[fuseki-server.jar:4.3.2]
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:433) ~[fuseki-server.jar:4.3.2]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224) ~[fuseki-server.jar:4.3.2]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213) ~[fuseki-server.jar:4.3.2]
    at org.apache.thrift.TUnion.read(TUnion.java:138) ~[fuseki-server.jar:4.3.2]
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:82) ~[fuseki-server.jar:4.3.2]
    ... 30 more
15:38:41 INFO  Server  :: [Task 1] finishes : Compact
15:40:43 INFO  Admin   :: [21] Tasks


Data-0002 folder was created with 2,5G of data (Data-0001 has 88G).


On 14/02/2022 21.30, Andy Seaborne wrote:

Yes, it's a good idea.

TDB2 in Fuseki has the "compact" operation to do this without stopping 
the server. It creates a new "Data-/" directory and you can delete 
lower numbered databases.
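
For reference, the compact operation mentioned here is exposed by the Fuseki admin HTTP interface as a POST to `/$/compact/{name}`. The Python sketch below only builds that request and does not send it. The server URL and dataset name are placeholders, and the `deleteOld=true` parameter is an assumption that depends on the Fuseki version, so it is worth checking the server documentation for the release in use.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def compact_request(server, dataset, delete_old=False):
    """Build (but do not send) the admin request that starts a TDB2 compaction."""
    url = f"{server.rstrip('/')}/$/compact/{quote(dataset.strip('/'))}"
    if delete_old:
        # Assumption: the running Fuseki version understands deleteOld.
        url += "?" + urlencode({"deleteOld": "true"})
    return Request(url, data=b"", method="POST")


if __name__ == "__main__":
    req = compact_request("http://localhost:3030", "ds", delete_old=True)
    print(req.get_method(), req.full_url)
    # To actually start the task: urllib.request.urlopen(req)
```

Compaction runs as a background task on the server, so a successful response to this request means the task was accepted, not that it completed.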


TDB1 needs the server to be stopped for a rebuild. If you can stop 
updates: stop updates (the server is now read-only), back up, rebuild from 
the backup, then stop the server and swap the databases.


    Andy

On 14/02/2022 11:39, Mikael Pesonen wrote:


Hi,

we have now 13M triples and space usage of Jena data folder is 88G 
which seems high. This is not including text index.
Should we cleanup/compress/rebuild etc the database regularly in 
order to keep disk usage lower, or is this normal disk usage?


BR





Does Jena need maintainance regarding to disk space?

2022-02-14 Thread Mikael Pesonen



Hi,

we have now 13M triples and space usage of Jena data folder is 88G which 
seems high. This is not including text index.
Should we cleanup/compress/rebuild etc the database regularly in order 
to keep disk usage lower, or is this normal disk usage?


BR



Access control for federated queries

2021-12-10 Thread Mikael Pesonen



Hi,
is there a recommended way to handle access control for servers which 
are serving federated queries?
So is it possible to configure which servers may access which graphs or 
datasets, for example?





Fuseki GUI 400 error on all queries

2021-12-09 Thread Mikael Pesonen



I've got a strange problem with one Fuseki setup. The web GUI is somehow 
generating incorrect queries, which results in error 400. The endpoint in the 
GUI is set to /fuseki/ds, which works on other setups. SPARQL update also 
works; the error occurs on all SELECT etc. queries. Looking at the browser 
developer tools, the query Content-Type is set to 
application/x-www-form-urlencoded. Could that be the reason? There is no 
other information on the error, just code 400. Fuseki is a somewhat old 
version, 3.16.





Escaped control characters in strings with Fuseki

2021-06-02 Thread Mikael Pesonen



Hi,

when posting turtle data with newlines, for example

lsr:82b9a3b5-6cba-4bd3-95c3-d309e8362c5e
        dcterms:title
                """2380
Bellow Control Valve"""@en .

it's returned for construct query in escaped form

lsr:82b9a3b5-6cba-4bd3-95c3-d309e8362c5e
   dcterms:title
    "2380\r\nBellow Control Valve"@en

Is it possible to get strings back in unescaped form? If not, is there a 
list of escaped characters available?
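
On the second question, the Turtle grammar defines the string escapes \t, \b, \n, \r, \f, \", \' and \\, plus the numeric forms \uXXXX and \UXXXXXXXX. The escaped output denotes the same literal value as the triple-quoted input; only the serialization differs. As a minimal sketch, assuming the client receives the lexical form exactly as written between the quotes, the Python function below (a hypothetical helper, not part of Jena) turns those escapes back into the characters they stand for.

```python
import re

# Single-character escapes defined by the Turtle grammar (ECHAR).
_ECHAR = {
    "t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
    '"': '"', "'": "'", "\\": "\\",
}

# A backslash followed by uXXXX, UXXXXXXXX, or any single character.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.))", re.DOTALL)


def unescape_turtle(lexical):
    """Replace Turtle string escapes in lexical with the characters they denote."""
    def replace(match):
        if match.group(1):
            return chr(int(match.group(1), 16))
        if match.group(2):
            return chr(int(match.group(2), 16))
        char = match.group(3)
        if char not in _ECHAR:
            raise ValueError(f"unknown escape: \\{char}")
        return _ECHAR[char]

    return _ESCAPE.sub(replace, lexical)


if __name__ == "__main__":
    print(repr(unescape_turtle(r"2380\r\nBellow Control Valve")))
```

An RDF library that parses the response does this decoding already, so a helper like this is only relevant when the Turtle text is handled as a plain string.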


Re: NodeTableTRDF/Read exception

2021-05-28 Thread Mikael Pesonen



Should I try an older Jena/Fuseki version?

On 28/05/2021 16.49, Andy Seaborne wrote:



On 28/05/2021 14:03, Mikael Pesonen wrote:


No this is the fresh db, started from empty today. And plenty of disk 
space.


So it's repeatable.

With no Minimal, Verifiable, Complete Example, it'll have to be an 
on-site investigation. Try different versions.


    Andy



On 28/05/2021 15.58, Andy Seaborne wrote:

Why are you adding data to a broken database?

On 28/05/2021 12:02, Mikael Pesonen wrote:


Actually now it happened again. Same size, about 80MB of turtle, 
imported without warnings this time, but reading the graph fails 
with this exception.


13:59:39 WARN  Fuseki  :: [44] RC = 500 : NodeTableTRDF/Read
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:113) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:74) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterBlockTriplesStar.hasNextBinding(QueryIterBlockTriplesStar.java:54) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki-server.jar:3.17.0

Re: NodeTableTRDF/Read exception

2021-05-28 Thread Mikael Pesonen



No this is the fresh db, started from empty today. And plenty of disk space.

On 28/05/2021 15.58, Andy Seaborne wrote:

Why are you adding data to a broken database?

On 28/05/2021 12:02, Mikael Pesonen wrote:


Actually now it happened again. Same size, about 80MB of turtle, 
imported without warnings this time, but reading the graph fails with 
this exception.


13:59:39 WARN  Fuseki  :: [44] RC = 500 : NodeTableTRDF/Read
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:113) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:74) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterBlockTriplesStar.hasNextBinding(QueryIterBlockTriplesStar.java:54) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114) ~[fuseki-server.jar:3.17.0

Re: NodeTableTRDF/Read exception

2021-05-28 Thread Mikael Pesonen
]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773) [fuseki-server.jar:3.17.0]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905) [fuseki-server.jar:3.17.0]
    at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized type 0
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144) ~[fuseki-server.jar:3.17.0]
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:433) ~[fuseki-server.jar:3.17.0]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224) ~[fuseki-server.jar:3.17.0]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213) ~[fuseki-server.jar:3.17.0]
    at org.apache.thrift.TUnion.read(TUnion.java:138) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:82) ~[fuseki-server.jar:3.17.0]
    ... 189 more
13:59:39 INFO  Fuseki  :: [44] 500 Server Error (48 ms)


On 27/05/2021 18.39, Andy Seaborne wrote:
See the thread following from Harri's message "TDBException: 
NodeTableTRDF/Read (vol. 2)"


    Andy


On 27/05/2021 15:56, Mikael Pesonen wrote:


Tried with the invalid data on fresh db and it didn't cause this 
exception. So root cause happened probably way earlier and is unknown.


On 20/05/2021 14.10, Andy Seaborne wrote:

Please can we have a complete, minimal example.

On 19/05/2021 11:12, Mikael Pesonen wrote:


More info on this. When the causes of the two warnings are fixed, 
same data is imported correctly and everything works.
So, when there are "too many" WARN level errors in importing data, 
the graph becomes corrupted.


Unlikely to be related to how many.

You wrote:
>> but many warnings on invalid data

not two.  What is the problem data?

    Andy


On 18/05/2021 18.02, Andy Seaborne wrote:



On 18/05/2021 13:03, Mikael Pesonen wrote:


This occurred again on another server. There were no errors 
before this, but many warnings on invalid data, if that is 
related. Now we get this error on all operations.


12:57:42 WARN  Fuseki  :: [line: 149803, col: 81] Bad 
IRI: <mailto:"Finskas> Code: 4/UNWISE_CHARACTER in PATH: The 
character matches no grammar rules of URIs/IRIs. These characters 
are permitted in RDF URI References, XML system identifiers, and 
XML Schema anyURIs.

...
14:48:28 WARN  Fuseki  :: [line: 475806, col: 80] Lexical 
form '' not valid for datatype XSD boolean

...



Most likely different issues - these are to do with your data 
(being read in?).


They don't appear related but you could try a minimal test case 
based on that data.


Another thing to investigate is to look at the earlier log entries 
for [24] and see if you can spot the RDF terms that are affected 
by comparing them to other incidents.


Maybe it is just one entry in the node table, or maybe not.

    Andy


14:52:06 WARN  Fuseki  :: [24] RC = 500 : NodeTableTRDF/Read
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:112) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator

Re: NodeTableTRDF/Read exception

2021-05-27 Thread Mikael Pesonen



Tried with the invalid data on fresh db and it didn't cause this 
exception. So root cause happened probably way earlier and is unknown.


On 20/05/2021 14.10, Andy Seaborne wrote:

Please can we have a complete, minimal example.

On 19/05/2021 11:12, Mikael Pesonen wrote:


More info on this. When the causes of the two warnings are fixed, 
same data is imported correctly and everything works.
So, when there are "too many" WARN level errors in importing data, 
the graph becomes corrupted.


Unlikely to be related to how many.

You wrote:
>> but many warnings on invalid data

not two.  What is the problem data?

    Andy


On 18/05/2021 18.02, Andy Seaborne wrote:



On 18/05/2021 13:03, Mikael Pesonen wrote:


This occurred again on another server. There were no errors before 
this, but many warnings on invalid data, if that is related. Now we 
get this error on all operations.


12:57:42 WARN  Fuseki  :: [line: 149803, col: 81] Bad IRI: 
<mailto:"Finskas> Code: 4/UNWISE_CHARACTER in PATH: The character 
matches no grammar rules of URIs/IRIs. These characters are 
permitted in RDF URI References, XML system identifiers, and XML 
Schema anyURIs.

...
14:48:28 WARN  Fuseki  :: [line: 475806, col: 80] Lexical 
form '' not valid for datatype XSD boolean

...



Most likely different issues - these are to do with your data (being 
read in?).


They don't appear related but you could try a minimal test case 
based on that data.


Another thing to investigate is to look at the earlier log entries 
for [24] and see if you can spot the RDF terms that are affected by 
comparing them to other incidents.


Maybe it is just one entry in the node table, or maybe not.

    Andy


14:52:06 WARN  Fuseki  :: [24] RC = 500 : NodeTableTRDF/Read
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:112) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) ~[fuseki-server.jar:3.17.0]

...
    at org.apache.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:74) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.sparql.engine.ResultSetCheckCondition.hasNext(ResultSetCheckCondition.java:55) ~[fuseki-server.jar:3.17.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeQuery(SPARQLQueryProcessor.java:324) ~[fuseki-server.jar:3.17.0]
    at

...
Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized type 0
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144) ~[fuseki-server.jar:3.17.0]
    at org.apache.thrift.

Re: Construct Quads on Apache Jena Fuseki

2021-05-26 Thread Mikael Pesonen



Thanks, that works. Gave up too easily since Fuseki web GUI shows error 
"This line is invalid" for line "{ GRAPH ?g { ?s  ?p ?o } } ".


On 26/05/2021 17.24, Andy Seaborne wrote:

ARQ supports GRAPH in CONSTRUCT.

You need to use a suitable MIME type for quads.

PREFIX : <http://example.com/>

CONSTRUCT
{ GRAPH ?g { ?s  ?p ?o } }
WHERE
{
  VALUES ?g { :g1 :g2 }
  GRAPH ?g { ?s  ?p ?o }
}

    Andy
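
To illustrate the MIME type point above: the request below asks a Fuseki query endpoint for TriG, one of the quad-capable serializations (N-Quads, application/n-quads, is another). This is a minimal sketch rather than the thread's own method; the endpoint URL, the dataset name "ds" and the helper names are assumptions for illustration.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with your own Fuseki dataset query URL.
ENDPOINT = "http://localhost:3030/ds/query"

QUERY = """
PREFIX : <http://example.com/>
CONSTRUCT { GRAPH ?g { ?s ?p ?o } }
WHERE {
  VALUES ?g { :g1 :g2 }
  GRAPH ?g { ?s ?p ?o }
}
"""


def build_quads_request(endpoint, query, mime="application/trig"):
    """Build a form-encoded SPARQL POST request that asks for a quads format."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": mime,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def dump_graphs(endpoint, query, out_path):
    """Send the request and stream the response to a file (needs a running server)."""
    request = build_quads_request(endpoint, query)
    with urllib.request.urlopen(request) as resp, open(out_path, "wb") as out:
        for chunk in iter(lambda: resp.read(65536), b""):
            out.write(chunk)
```

Calling dump_graphs(ENDPOINT, QUERY, "graphs.trig") would write the selected graphs to one file, provided the server accepts the query.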

On 26/05/2021 12:46, Mikael Pesonen wrote:


Hi,

Sorry for replying to an old thread. I would like to dump some graphs 
from a corrupted DB. Is there a way to dump them with one quad query 
instead of dumping each graph separately as triples?


On 03/12/2020 16.23, Ahmed Helal wrote:

Hi,

My query is:
CONSTRUCT {
   graph ?g
 {?s ?p ?o}
}
WHERE{
   GRAPH ?g { ?s ?p ?o }. values ?g {<http://example#v1>}
}

The interface shows an error/message for the second line saying 
(graph ?g): This line is invalid. Expected: VAR1, VAR2...


This is the returned result:
{
   "readyState": 4,
   "responseText": "",
   "status": 200,
   "statusText": "OK"
}


Thank you,
Ahmed.




From: Rob Vesse 
Sent: Thursday, December 3, 2020 4:00 AM
To: users@jena.apache.org 
Subject: Re: Construct Quads on Apache Jena Fuseki

The list does not permit attachments so we cannot see screenshots.  
Please provide the example query that is failing and the error/log 
message(s) you see in relation to the failure




Rob



From: Ahmed Helal 
Reply to: 
Date: Thursday, 3 December 2020 at 06:26
To: "users@jena.apache.org" 
Subject: Construct Quads on Apache Jena Fuseki



Hello everyone,



I am trying to construct Quads following the documentation on this 
page:


https://jena.apache.org/documentation/query/construct-quad.html



However, when I write my script on Apache Jena Fuseki, it complains. 
I have attached a screenshot below






Thank you,

Ahmed Helal.



Note: The script in the screenshot is just for the sake of the example








Re: Construct Quads on Apache Jena Fuseki

2021-05-26 Thread Mikael Pesonen



Hi,

Sorry for replying to an old thread. I would like to dump some graphs from a 
corrupted DB. Is there a way to dump them with one quad query instead of 
dumping each graph separately as triples?


On 03/12/2020 16.23, Ahmed Helal wrote:

Hi,

My query is:
CONSTRUCT {
   graph ?g
 {?s ?p ?o}
}
WHERE{
   GRAPH ?g { ?s ?p ?o }. values ?g {<http://example#v1>}
}

The interface shows an error/message for the second line saying (graph ?g): 
This line is invalid. Expected: VAR1, VAR2...

This is the returned result:
{
   "readyState": 4,
   "responseText": "",
   "status": 200,
   "statusText": "OK"
}


Thank you,
Ahmed.




From: Rob Vesse 
Sent: Thursday, December 3, 2020 4:00 AM
To: users@jena.apache.org 
Subject: Re: Construct Quads on Apache Jena Fuseki

The list does not permit attachments so we cannot see screenshots.  Please 
provide the example query that is failing and the error/log message(s) you see 
in relation to the failure



Rob



From: Ahmed Helal 
Reply to: 
Date: Thursday, 3 December 2020 at 06:26
To: "users@jena.apache.org" 
Subject: Construct Quads on Apache Jena Fuseki



Hello everyone,



I am trying to construct Quads following the documentation on this page:

https://jena.apache.org/documentation/query/construct-quad.html



However, when I write my script on Apache Jena Fuseki, it complains. I have 
attached a screenshot below





Thank you,

Ahmed Helal.



Note: The script in the screenshot is just for the sake of the example






Re: NodeTableTRDF/Read exception

2021-05-21 Thread Mikael Pesonen



Unfortunately the data is confidential. I can try to reproduce the same 
errors on a reset DB and see if the same thing happens.


On 20/05/2021 14.10, Andy Seaborne wrote:

Please can we have a complete, minimal example.

On 19/05/2021 11:12, Mikael Pesonen wrote:


More info on this. When the causes of the two warnings are fixed, the 
same data is imported correctly and everything works.
So, when there are "too many" WARN-level errors while importing data, 
the graph becomes corrupted.


Unlikely to be related to how many.

You wrote:
>> but many warnings on invalid data

not two.  What is the problem data?

    Andy


On 18/05/2021 18.02, Andy Seaborne wrote:



On 18/05/2021 13:03, Mikael Pesonen wrote:


This occurred again on another server. There were no errors before 
this, but many warnings on invalid data, if that is related. Now we 
get this error on all operations.


12:57:42 WARN  Fuseki  :: [line: 149803, col: 81] Bad IRI: 
<mailto:"Finskas> Code: 4/UNWISE_CHARACTER in PATH: The character 
matches no grammar rules of URIs/IRIs. These characters are 
permitted in RDF URI References, XML system identifiers, and XML 
Schema anyURIs.

...
14:48:28 WARN  Fuseki  :: [line: 475806, col: 80] Lexical 
form '' not valid for datatype XSD boolean

...



Most likely different issues - these are to do with your data (being 
read in?).


They don't appear related but you could try a minimal test case 
based on that data.


Another thing to investigate is to look at the earlier log entries 
for [24] and see if you can spot the RDF terms that are affected by 
comparing them to other incidents.


Maybe it is just one entry in the node table, or maybe not.

    Andy
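
The suggestion to look at the earlier log entries for [24] can be sketched as a small filter over the Fuseki log. This is only an illustration: it assumes the request id appears in square brackets right after "::" or the log level, as in the lines quoted in this thread, and the function name is hypothetical.

```python
import re

# Matches request ids such as "Fuseki  :: [24] RC = 500" or "Fuseki WARN [346] RC = 500".
# Parser warnings like "[line: 149803, col: 81]" are skipped because they are not plain numbers.
REQUEST_ID = re.compile(r"(?:::|WARN|INFO|ERROR)\s+\[(\d+)\]")


def lines_for_request(log_lines, request_id):
    """Return the log lines tagged with the given Fuseki request id."""
    wanted = str(request_id)
    selected = []
    for line in log_lines:
        match = REQUEST_ID.search(line)
        if match and match.group(1) == wanted:
            selected.append(line.rstrip("\n"))
    return selected
```

Running this over the logs of each incident and comparing the selected lines may help to spot which RDF terms are involved.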


14:52:06 WARN  Fuseki  :: [24] RC = 500 : NodeTableTRDF/Read
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:112) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) 
~[fuseki-server.jar:3.17.0]

...
 at 
org.apache.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:74) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.ResultSetCheckCondition.hasNext(ResultSetCheckCondition.java:55) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeQuery(SPARQLQueryProcessor.java:324) 
~[fuseki-server.jar:3.17.0]
 at 


...
Caused by: org.apache.thrift.protocol.TProtocolException: 
Unrecognized type 0
 at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProt

Re: NodeTableTRDF/Read exception

2021-05-19 Thread Mikael Pesonen



More info on this. When the causes of the two warnings are fixed, the same 
data is imported correctly and everything works.
So, when there are "too many" WARN-level errors while importing data, the 
graph becomes corrupted.


On 18/05/2021 18.02, Andy Seaborne wrote:



On 18/05/2021 13:03, Mikael Pesonen wrote:


This occurred again on another server. There were no errors before 
this, but many warnings on invalid data, if that is related. Now we 
get this error on all operations.


12:57:42 WARN  Fuseki  :: [line: 149803, col: 81] Bad IRI: 
<mailto:"Finskas> Code: 4/UNWISE_CHARACTER in PATH: The character 
matches no grammar rules of URIs/IRIs. These characters are permitted 
in RDF URI References, XML system identifiers, and XML Schema anyURIs.

...
14:48:28 WARN  Fuseki  :: [line: 475806, col: 80] Lexical 
form '' not valid for datatype XSD boolean

...



Most likely different issues - these are to do with your data (being 
read in?).


They don't appear related but you could try a minimal test case based 
on that data.


Another thing to investigate is to look at the earlier log entries for 
[24] and see if you can spot the RDF terms that are affected by 
comparing them to other incidents.


Maybe it is just one entry in the node table, or maybe not.

    Andy


14:52:06 WARN  Fuseki  :: [24] RC = 500 : NodeTableTRDF/Read
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:112) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) 
~[fuseki-server.jar:3.17.0]

...
 at 
org.apache.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:74) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.ResultSetCheckCondition.hasNext(ResultSetCheckCondition.java:55) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeQuery(SPARQLQueryProcessor.java:324) 
~[fuseki-server.jar:3.17.0]
 at 


...
Caused by: org.apache.thrift.protocol.TProtocolException: 
Unrecognized type 0
 at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:433) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213) 
~[fus

Re: NodeTableTRDF/Read exception

2021-05-19 Thread Mikael Pesonen



This error occurs with all actions related to the same graph, so I can't, for 
example, drop or clear the graph anymore.

Actions on other graphs work as usual.

On 18/05/2021 18.02, Andy Seaborne wrote:



On 18/05/2021 13:03, Mikael Pesonen wrote:


This occurred again on another server. There were no errors before 
this, but many warnings on invalid data, if that is related. Now we 
get this error on all operations.


12:57:42 WARN  Fuseki  :: [line: 149803, col: 81] Bad IRI: 
<mailto:"Finskas> Code: 4/UNWISE_CHARACTER in PATH: The character 
matches no grammar rules of URIs/IRIs. These characters are permitted 
in RDF URI References, XML system identifiers, and XML Schema anyURIs.

...
14:48:28 WARN  Fuseki  :: [line: 475806, col: 80] Lexical 
form '' not valid for datatype XSD boolean

...



Most likely different issues - these are to do with your data (being 
read in?).


They don't appear related but you could try a minimal test case based 
on that data.


Another thing to investigate is to look at the earlier log entries for 
[24] and see if you can spot the RDF terms that are affected by 
comparing them to other incidents.


Maybe it is just one entry in the node table, or maybe not.

    Andy


14:52:06 WARN  Fuseki  :: [24] RC = 500 : NodeTableTRDF/Read
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:112) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:352) 
~[fuseki-server.jar:3.17.0]
 at org.apache.jena.atlas.iterator.Iter.next(Iter.java:1072) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:94) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:47) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.mem.TrackingTripleIterator.next(TrackingTripleIterator.java:31) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:145) 
~[fuseki-server.jar:3.17.0]

...
 at 
org.apache.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:74) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.sparql.engine.ResultSetCheckCondition.hasNext(ResultSetCheckCondition.java:55) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeQuery(SPARQLQueryProcessor.java:324) 
~[fuseki-server.jar:3.17.0]
 at 


...
Caused by: org.apache.thrift.protocol.TProtocolException: 
Unrecognized type 0
 at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:433) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213) 
~[fuseki-server.jar:3.17.0]
 at org.apache.thrift.TUnion.read(TUnion.java:138) 
~[fuseki-server

Re: NodeTableTRDF/Read exception

2021-05-18 Thread Mikael Pesonen
.jar:3.17.0]
    at org.apache.thrift.TUnion.read(TUnion.java:138) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:82) 
~[fuseki-server.jar:3.17.0]

    ... 108 more
14:52:06 INFO  Fuseki  :: [24] 500 Server Error (12 ms)



On 27/04/2021 13.22, Andy Seaborne wrote:



On 27/04/2021 09:59, Mikael Pesonen wrote:


That's our guess too, but it would be nice to have some idea where to 
look for the cause. Does Jena/Fuseki handle disk-full situations 
without corruption?


It should do (the transaction aborts) which is why I was asking.

Bulk loading, other than loader "basic" which is safe, depends on 
exactly when it happens in the process, i.e. no guarantees.


    Andy
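
As a precaution related to the disk-full question above, a load script could check free space before starting a bulk load. This is a hedged sketch and not something Jena provides: the function name and the safety factor of 2 are assumptions, and the right margin depends on the data.

```python
import shutil


def has_headroom(path, needed_bytes, safety_factor=2.0):
    """True if the filesystem holding `path` has needed_bytes * safety_factor free."""
    free = shutil.disk_usage(path).free
    return free >= needed_bytes * safety_factor
```

A script would call has_headroom on the database directory with the size of the file to load, and refuse to start the load when it returns False.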




On 27/04/2021 11.56, Andy Seaborne wrote:

In the original message,


There was a shortage of disk space, hope the db is not corrupted.


What happened?

This is the only thing you've mentioned that relates to update.

Everything else is "read", and the fault occurred at an earlier time 
or it's an environmental factor (one mentioned file access 
permissions, e.g. another process is interfering with files).


Apr 12 12:30:55 solid java[22910]: [2021-04-12 12:30:55] Fuseki WARN [346] RC = 500 : a fault occurred in a recent unsafe memory access operation in compiled Java code
Apr 12 12:30:55 solid java[22910]: at org.apache.jena.dboe.base.buffer.RecordBuffer.compare(RecordBuffer.java:192) ~[fuseki-server.jar:3.17.0]


So JDK ByteBuffer.get failed, but it works almost always. It is likely to 
be an environmental factor (the file system, a background process 
messing around, a hardware issue).


   Andy





--
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's 
Tools - Text Tools - E-books and M-books

Mikael Pesonen
System Engineer

e-mail: mikael.peso...@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND



Re: Terms context

2021-04-29 Thread Mikael Pesonen



This lemon lexicon model could be useful: 
https://www.w3.org/2016/05/ontolex/


On 29/04/2021 15.27, Laura Morales wrote:

I have problems with the fact that, in English, words can have multiple 
meanings and can also be used as verbs, nouns, etc. In RDF, I feel like I'm 
compelled to define a term and its one meaning that is unique across the entire 
vocabulary. If I want to use the same term to mean two or more things, I have 
to use two dictionaries or I have to come up with weird combinations of 
multiple words. You know, like SimpleBeanFactoryAwareAspectInstanceFactory.
I was wondering if there is any way to define a term whose meaning depends on the 
context. For example Lorem.foobar and Ipsum.foobar, "foobar" could mean two 
entirely different things depending on whether it's a property of the type Lorem or type 
Ipsum. AFAIK OWL defines domains/ranges for terms, so maybe these can be used for this 
goal? What would be the practical implications, for example if I were to use Fuseki 
without an OWL reasoner (i.e. just by loading a bunch of triples and starting to query with 
SPARQL)?





Re: NodeTableTRDF/Read exception

2021-04-27 Thread Mikael Pesonen



The server just ran out of disk space, but it has been on low usage since, so 
it could have happened a few months ago. And the logs only cover 1 month...


On 27/04/2021 13.22, Andy Seaborne wrote:



On 27/04/2021 09:59, Mikael Pesonen wrote:


That's our guess too, but it would be nice to have some idea where to 
look for the cause. Does Jena/Fuseki handle disk-full situations 
without corruption?


It should do (the transaction aborts) which is why I was asking.

Bulk loading, other than loader "basic" which is safe, depends on 
exactly when it happens in the process, i.e. no guarantees.


   Andy




On 27/04/2021 11.56, Andy Seaborne wrote:

In the original message,


There was a shortage of disk space, hope the db is not corrupted.


What happened?

This is the only thing you've mentioned that relates to update.

Everything else is "read", and the fault occurred at an earlier time 
or it's an environmental factor (one mentioned file access 
permissions, e.g. another process is interfering with files).


Apr 12 12:30:55 solid java[22910]: [2021-04-12 12:30:55] Fuseki WARN [346] RC = 500 : a fault occurred in a recent unsafe memory access operation in compiled Java code
Apr 12 12:30:55 solid java[22910]: at org.apache.jena.dboe.base.buffer.RecordBuffer.compare(RecordBuffer.java:192) ~[fuseki-server.jar:3.17.0]


So JDK ByteBuffer.get failed, but it works almost always. It is likely to 
be an environmental factor (the file system, a background process 
messing around, a hardware issue).


   Andy








Re: NodeTableTRDF/Read exception

2021-04-27 Thread Mikael Pesonen



That's our guess too, but it would be nice to have some idea where to look 
for the cause. Does Jena/Fuseki handle disk-full situations without 
corruption?


On 27/04/2021 11.56, Andy Seaborne wrote:

In the original message,


There was a shortage of disk space, hope the db is not corrupted.


What happened?

This is the only thing you've mentioned that relates to update.

Everything else is "read", and the fault occurred at an earlier time or 
it's an environmental factor (one mentioned file access permissions, 
e.g. another process is interfering with files).


Apr 12 12:30:55 solid java[22910]: [2021-04-12 12:30:55] Fuseki WARN [346] RC = 500 : a fault occurred in a recent unsafe memory access operation in compiled Java code
Apr 12 12:30:55 solid java[22910]: at org.apache.jena.dboe.base.buffer.RecordBuffer.compare(RecordBuffer.java:192) ~[fuseki-server.jar:3.17.0]


So JDK ByteBuffer.get failed, but it works almost always. It is likely to 
be an environmental factor (the file system, a background process 
messing around, a hardware issue).


   Andy






Re: NodeTableTRDF/Read exception

2021-04-26 Thread Mikael Pesonen
(SessionHandler.java:1582) 
~[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) 
~[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349) 
~[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) 
~[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:716) 
~[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) 
~[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.Server.handle(Server.java:516) 
~[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) 
~[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:556) 
[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) 
[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273) 
[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) 
[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) 
[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) 
[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773) 
[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905) 
[fuseki-server.jar:3.17.0]
Apr 12 12:30:55 solid java[22910]: [2021-04-12 12:30:55] Fuseki INFO  
[346] 500 Server Error (26.663 s)


On 26/04/2021 13.24, Mikael Pesonen wrote:


There are a few of these too (I'm grepping journalctl with 'fuseki'):

Mar 22 16:04:22 solid java[1398]: [2021-03-22 16:04:22] WebAppContext 
WARN  Failed startup of context o.e.j.w.WebAppContext@15dd5ac2{Apache 
Jena Fuseki Server,/,file:///opt/insight/jena/webapp/,UNAVAILABLE}
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.openBySpecificType(AssemblerGroup.java:165) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.open(AssemblerGroup.java:144) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.assembler.assemblers.AssemblerGroup$ExpandingAssemblerGroup.open(AssemblerGroup.java:93) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.assembler.assemblers.AssemblerBase.open(AssemblerBase.java:39) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.assembler.assemblers.AssemblerBase.open(AssemblerBase.java:35) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.build.FusekiConfig.getDataset(FusekiConfig.java:687) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.build.FusekiConfig.buildDataService(FusekiConfig.java:444) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.build.FusekiConfig.buildDataAccessPoint(FusekiConfig.java:434) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.webapp.FusekiWebapp.configFromTemplate(FusekiWebapp.java:323) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.webapp.FusekiWebapp.initServerConfiguration(FusekiWebapp.java:252) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.webapp.FusekiWebapp.initializeDataAccessPoints(FusekiWebapp.java:219) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.webapp.FusekiServerListener.serverInitialization(FusekiServerListener.java:97) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.webapp.FusekiServerListener.contextInitialized(FusekiServerListener.java:57) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398

Re: NodeTableTRDF/Read exception

2021-04-26 Thread Mikael Pesonen
:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.eclipse.jetty.server.Server.start(Server.java:423) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.eclipse.jetty.server.Server.doStart(Server.java:387) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.cmd.JettyFusekiWebapp.start(JettyFusekiWebapp.java:125) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.cmd.FusekiCmd.runFuseki(FusekiCmd.java:376) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.cmd.FusekiCmd$FusekiCmdInner.exec(FusekiCmd.java:360) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
jena.cmd.CmdMain.mainMethod(CmdMain.java:92) [fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
jena.cmd.CmdMain.mainRun(CmdMain.java:58) [fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
jena.cmd.CmdMain.mainRun(CmdMain.java:45) [fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.cmd.FusekiCmd$FusekiCmdInner.innerMain(FusekiCmd.java:105) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.fuseki.cmd.FusekiCmd.main(FusekiCmd.java:68) 
[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.base.file.LocationLock.checkLockFileForRead(LocationLock.java:276) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.base.file.LocationLock.getOwner(LocationLock.java:104) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.base.file.LocationLock.canObtain(LocationLock.java:139) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.StoreConnection._makeAndCache(StoreConnection.java:280) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:244) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:258) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.transaction.DatasetGraphTransaction.<init>(DatasetGraphTransaction.java:69) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.sys.TDBMaker.createDirect(TDBMaker.java:126) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.sys.TDBMaker._create(TDBMaker.java:112) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.sys.TDBMaker.createDatasetGraphTransaction(TDBMaker.java:43) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.TDBFactory._createDatasetGraph(TDBFactory.java:93) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.TDBFactory.createDatasetGraph(TDBFactory.java:71) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.assembler.DatasetAssemblerTDB.make(DatasetAssemblerTDB.java:57) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.tdb.assembler.DatasetAssemblerTDB.createDataset(DatasetAssemblerTDB.java:48) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.sparql.core.assembler.DatasetAssembler.open(DatasetAssembler.java:43) 
~[fuseki-server.jar:3.17.0]
Mar 22 16:04:22 solid java[1398]: at 
org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.openBySpecificType(AssemblerGroup.java:157) 
~[fuseki-server.jar:3.17.0]


On 26/04/2021 13.00, Andy Seaborne wrote:


On 26/04/2021 10:19, Mikael Pesonen wrote:


Is this the correct one:


Well, it's not the one you showed in the initial message which was

Apr 23 13:51:09 solid java[6846]: [2021-04-23 13:51:09] Fuseki WARN  
[2] RC = 500 : NodeTableTRDF/Read


but the important question is:

Do you have the logs from when the update/disk full incident happened?

That seems like it is when the problem started. The rest are 
consequences.


   Andy



Apr 23 14:34:58 solid java[1153]: [2021-04-23 14:34:58] Fuseki

Re: Dumping RDF data as human readable table

2021-04-26 Thread Mikael Pesonen




On 24/04/2021 16.48, Bob DuCharme wrote:

I agree. What would this tool do with the following?

 :s :p1 "label1a" .
 :s :p1 "label1b" .
 :s :p2 "label2" .
 :s :p3 :s2 . :s2 :p1 "label3" .

For the sake of visualization it could be

      :p1                    :p2       :p3/:p1
:s    "label1a","label1b"    label2    label3

I agree that there is no point in making this table other than to get a 
quick human-readable view of the contents and the empty "cells", which is 
the case here.
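
A minimal sketch of the pivot described above, assuming the triples are 
already available as plain (subject, predicate, object) tuples. The Literal 
marker class, the pivot and to_csv helpers, and the sample data are 
hypothetical names for illustration, not part of Jena. Repeated values for 
one predicate are joined into a single cell, and linked resources are 
followed one hop only.

```python
import csv
import io
from collections import defaultdict


class Literal(str):
    """Marks an object value as a literal rather than a linked resource."""


# Hypothetical sample data mirroring the example in this thread.
TRIPLES = [
    (":s", ":p1", Literal("label1a")),
    (":s", ":p1", Literal("label1b")),
    (":s", ":p2", Literal("label2")),
    (":s", ":p3", ":s2"),
    (":s2", ":p1", Literal("label3")),
]


def pivot(triples, roots):
    """Return (columns, rows): one row per root subject, one column per path."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    columns = []
    rows = {}
    for root in roots:
        cells = defaultdict(list)
        for p, o in by_subject.get(root, []):
            if isinstance(o, Literal):
                cells[p].append(str(o))
            else:
                # Follow the linked resource one hop; header becomes "p/p2".
                for p2, o2 in by_subject.get(o, []):
                    if isinstance(o2, Literal):
                        cells[f"{p}/{p2}"].append(str(o2))
        for col in cells:
            if col not in columns:
                columns.append(col)
        rows[root] = cells
    return columns, rows


def to_csv(columns, rows):
    """Serialize the pivoted rows; multiple values share one cell."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["subject"] + columns)
    for subject, cells in rows.items():
        writer.writerow([subject] + [",".join(cells.get(c, [])) for c in columns])
    return out.getvalue()


if __name__ == "__main__":
    cols, table = pivot(TRIPLES, [":s"])
    print(to_csv(cols, table))
```

Joining repeated values into one cell sidesteps the duplicate-column 
problem, at the cost of a table that cannot be turned back into triples 
reliably.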





For a somewhat animated version of this issue as I understand it, 
showing why the ability to store data that doesn't fit well into 
tables is one of the great advantages of RDF, see 2:38 - 3:04 of 
https://www.youtube.com/watch?v=FvGndkpa4K0 .


Thanks,
Bob

On 4/22/21 5:45 AM, Martynas Jusevičius wrote:

OK, I think now I get a better idea of what you want to do.

It won't work in a general case because RDF resources can have
duplicate properties but CSV shouldn't have duplicate columns.


On Thu, Apr 22, 2021 at 11:41 AM Mikael Pesonen
 wrote:


I think that is for serializing the triples as they are, in a cleaner
format, but it doesn't say anything about rearranging the data.

On 22/04/2021 12.34, Martynas Jusevičius wrote:
This is probably what you want: 
https://www.w3.org/TR/sparql11-results-csv-tsv/


Try curl -H "Accept: text/csv" with SELECT results.

On Thu, Apr 22, 2021 at 11:30 AM Mikael Pesonen
 wrote:

Hi,

not exactly Jena related, but does anyone know if there is a tool or
sparql query that would read (almost) any kind of RDF data and make a
csv sheet where each column has predicates as headers and values as
cells. It would also be nice to get linked resources into the table the
same way. Even a hint how to make each predicate a new column in sparql
would be helpful.

So for example

:s :p1 "label1" .
:s :p2 "label2" .
:s :p3 :s2 . :s2 :p1 "label3" .

->
      :p1       :p2       :p3/:p1
:s    label1    label2    label3

Br,
Mikael


--
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's 
and Writer's Tools - Text Tools - E-books and M-books


Mikael Pesonen
System Engineer

e-mail: mikael.peso...@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND






Re: NodeTableTRDF/Read exception

2021-04-26 Thread Mikael Pesonen



Is this the correct one:

Apr 23 14:34:58 solid java[1153]: [2021-04-23 14:34:58] Fuseki INFO  [1] 
500 Server Error (1.976 s)
Apr 23 15:30:27 solid systemd[1]: Configuration file 
/lib/systemd/system/apache-jena-fuseki.service is marked 
world-inaccessible. This has no effect as configuration data is 
accessible via APIs without restrictions. Proceeding anyway.
Apr 23 15:30:27 solid systemd[1]: Configuration file 
/lib/systemd/system/apache-jena-fuseki.path is marked 
world-inaccessible. This has no effect as configuration data is 
accessible via APIs without restrictions. Proceeding anyway.
Apr 23 15:30:34 solid systemd[1]: Configuration file 
/lib/systemd/system/apache-jena-fuseki.service is marked 
world-inaccessible. This has no effect as configuration data is 
accessible via APIs without restrictions. Proceeding anyway.
Apr 23 15:30:54 solid sudo[21457]:   **: TTY=pts/0 ; PWD=/home/**; 
USER=root ; COMMAND=/bin/cat /lib/systemd/system/apache-jena-fuseki.service
Apr 23 15:31:44 solid sudo[22157]:   **: TTY=pts/1 ; PWD=/home/**; 
USER=root ; COMMAND=/bin/cat /lib/systemd/system/apache-jena-fuseki.service


After that comes NodeTableTRDF/Read exception.



On 23/04/2021 21.11, Andy Seaborne wrote:



On 23/04/2021 14:57, Mikael Pesonen wrote:


Sorry here is the entire log


It's a different log. [2] is now [5]



Apr 23 16:23:49 solid java[1153]: [2021-04-23 16:23:49] Fuseki INFO  
[5] POST http://nimisampo.lingsoft.fi:3030/ds
Apr 23 16:23:49 solid java[1153]: [2021-04-23 16:23:49] Fuseki INFO  
[6] POST http://nimisampo.lingsoft.fi:3030/ds
Apr 23 16:23:49 solid java[1153]: [2021-04-23 16:23:49] Fuseki INFO  [5] 


Apr 23 16:23:49 solid java[1153]: [2021-04-23 16:23:49] Fuseki INFO  
[6] 200 OK (155 ms)



Apr 23 16:23:49 solid java[1153]: [2021-04-23 16:23:49] Fuseki WARN  
[5] RC = 500 : NodeTableTRDF/Read
Apr 23 16:23:49 solid java[1153]: at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) 
~[fuseki-server.jar:3.17.0]

...

~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773) 
[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905) 
[fuseki-server.jar:3.17.0]


This seems to be two stacktraces or printing the "caused by" seems to 
be wrong


TProtocolUtil:144 is an exception throw.

Regardless, the node table is broken.

Do you have the logs from when the update/disk full incident happened?

   Andy


Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:433) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.TUnion.read(TUnion.java:138) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:82) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: [2021-04-23 16:23:49] Fuseki INFO  
[5] 500 Server Error (568 ms)



On 23/04/2021 16.40, Andy Seaborne wrote:



On 23/04/2021 13:33, Mikael Pesonen wrote:


Hi,

we get this exception now when starting Jena and it's not loading 
any data. I have no idea what is changed on the server.


Apr 23 13:51:09 solid java[6846]: [2021-04-23 13:51:09] Fuseki WARN 
[2] RC = 500 : NodeTableTRDF/Read
Apr 23 13:51:09 solid java[6846]: 
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
Apr 23 13:51:09 solid java[6846]: at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) 
~[fuseki-server.jar:3.17.0]
Apr 23 13:51:09 solid java[6846]: at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) 
~[fuseki-server.jar:3.17.0]


TDBException wraps a Thrift exception which has the details.



Any idea what is going on? There was shortage of disk space, hope 
the db is not corrupted.


You would have got errors at update time.

   Andy





Service:

ExecStart=/usr/lib/jvm/java-8-oracle/bin/java -Xmx3000M -jar 
/opt/insight/jena/fuseki-server.jar --update --port 3030

Re: NodeTableTRDF/Read exception

2021-04-23 Thread Mikael Pesonen
util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773) 
[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905) 
[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:433) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.thrift.TUnion.read(TUnion.java:138) ~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:82) 
~[fuseki-server.jar:3.17.0]
Apr 23 16:23:49 solid java[1153]: [2021-04-23 16:23:49] Fuseki INFO  [5] 
500 Server Error (568 ms)



On 23/04/2021 16.40, Andy Seaborne wrote:



On 23/04/2021 13:33, Mikael Pesonen wrote:


Hi,

we get this exception now when starting Jena and it's not loading any 
data. I have no idea what is changed on the server.


Apr 23 13:51:09 solid java[6846]: [2021-04-23 13:51:09] Fuseki WARN  
[2] RC = 500 : NodeTableTRDF/Read
Apr 23 13:51:09 solid java[6846]: org.apache.jena.tdb2.TDBException: 
NodeTableTRDF/Read
Apr 23 13:51:09 solid java[6846]: at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) 
~[fuseki-server.jar:3.17.0]
Apr 23 13:51:09 solid java[6846]: at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) 
~[fuseki-server.jar:3.17.0]


TDBException wraps a Thrift exception which has the details.



Any idea what is going on? There was shortage of disk space, hope the 
db is not corrupted.


You would have got errors at update time.

   Andy





Service:

ExecStart=/usr/lib/jvm/java-8-oracle/bin/java -Xmx3000M -jar 
/opt/insight/jena/fuseki-server.jar --update --port 3030 
--config=/opt/insight/jena/run/fuseki_config.ttl


Config:

PREFIX : <#>
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ja: <http://jena.hpl.hp.com/2005/11/Assembler#>

PREFIX tdb2: <http://jena.apache.org/2016/tdb#>
PREFIX text: <http://jena.apache.org/text#>

PREFIX lsrm: <https://resource.lingsoft.fi/ns/resource_meta#>

[] rdf:type fuseki:Server ;
fuseki:services (
:service
) .

:service rdf:type fuseki:Service ;
fuseki:name "/ds" ; # http://host:port/ds
fuseki:serviceQuery "query" ; # SPARQL query service
fuseki:serviceQuery "sparql" ; # SPARQL query service
fuseki:serviceUpdate "update" ; # SPARQL update service
fuseki:serviceUpload "upload" ; # Non-SPARQL upload service
fuseki:serviceReadWriteGraphStore "data" ; # SPARQL Graph store protocol (read and write)

fuseki:dataset :text_dataset ;
.

:text_dataset rdf:type text:TextDataset ;
text:dataset :tdb2_dataset ;
text:index :indexLucene ;
.
:tdb2_dataset rdf:type tdb2:DatasetTDB2 ;
tdb2:location "/opt/insight/jena_data/" ;
.

# Text index description
:indexLucene a text:TextIndexLucene ;
text:directory  ;
text:entityMap :entMap ;
text:storeValues true ;
#text:analyzer [ a text:StandardAnalyzer ] ;
#text:queryAnalyzer [ a text:KeywordAnalyzer ] ;
#text:queryParser text:QueryParser ; # text:AnalyzingQueryParser
#text:multilingualSupport true ;
.

:entMap a text:EntityMap ;
text:entityField "uri" ;
## Must be defined in the text:map
text:defaultField "lsrm_lmz_title" ;
## Enable deleting of text index entries.
text:uidField "uid" ;
text:langField "lang" ;
text:graphField "graph" ;
text:map (
[ text:field "lsrm_lmz_title" ; text:predicate lsrm:lmz_title]
[ text:field "lsrm_lmz_content" ; text:predicate lsrm:lmz_content]
)
.






NodeTableTRDF/Read exception

2021-04-23 Thread Mikael Pesonen



Hi,

we now get this exception when starting Jena, and it's not loading any 
data. I have no idea what has changed on the server.


Apr 23 13:51:09 solid java[6846]: [2021-04-23 13:51:09] Fuseki WARN  [2] 
RC = 500 : NodeTableTRDF/Read
Apr 23 13:51:09 solid java[6846]: org.apache.jena.tdb2.TDBException: 
NodeTableTRDF/Read
Apr 23 13:51:09 solid java[6846]: at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) 
~[fuseki-server.jar:3.17.0]
Apr 23 13:51:09 solid java[6846]: at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) 
~[fuseki-server.jar:3.17.0]


Any idea what is going on? There was a shortage of disk space; I hope the 
db is not corrupted.
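
Since the trouble appears to date from a disk-full incident, a small guard 
that checks free space before running updates may help avoid a repeat. This 
is only a sketch: the has_free_space helper and the threshold are 
hypothetical, and the data path should be whatever tdb2:location points to.

```python
import shutil


def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has enough free space."""
    usage = shutil.disk_usage(path)
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    # Assumed threshold of 5 GiB; "." stands in for the TDB2 data directory.
    if not has_free_space(".", 5 * 1024**3):
        print("Low disk space: postpone updates to the dataset")
```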




Service:

ExecStart=/usr/lib/jvm/java-8-oracle/bin/java -Xmx3000M -jar 
/opt/insight/jena/fuseki-server.jar --update --port 3030 
--config=/opt/insight/jena/run/fuseki_config.ttl


Config:

PREFIX : <#>
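PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ja: <http://jena.hpl.hp.com/2005/11/Assembler#>

PREFIX tdb2: <http://jena.apache.org/2016/tdb#>
PREFIX text: <http://jena.apache.org/text#>

PREFIX lsrm: <https://resource.lingsoft.fi/ns/resource_meta#>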

[] rdf:type fuseki:Server ;
fuseki:services (
:service
) .

:service rdf:type fuseki:Service ;
fuseki:name "/ds" ; # http://host:port/ds
fuseki:serviceQuery "query" ; # SPARQL query service
fuseki:serviceQuery "sparql" ; # SPARQL query service
fuseki:serviceUpdate "update" ; # SPARQL update service
fuseki:serviceUpload "upload" ; # Non-SPARQL upload service
fuseki:serviceReadWriteGraphStore "data" ; # SPARQL Graph store protocol (read and write)

fuseki:dataset :text_dataset ;
.

:text_dataset rdf:type text:TextDataset ;
text:dataset :tdb2_dataset ;
text:index :indexLucene ;
.
:tdb2_dataset rdf:type tdb2:DatasetTDB2 ;
tdb2:location "/opt/insight/jena_data/" ;
.

# Text index description
:indexLucene a text:TextIndexLucene ;
text:directory  ;
text:entityMap :entMap ;
text:storeValues true ;
#text:analyzer [ a text:StandardAnalyzer ] ;
#text:queryAnalyzer [ a text:KeywordAnalyzer ] ;
#text:queryParser text:QueryParser ; # text:AnalyzingQueryParser
#text:multilingualSupport true ;
.

:entMap a text:EntityMap ;
text:entityField "uri" ;
## Must be defined in the text:map
text:defaultField "lsrm_lmz_title" ;
## Enable deleting of text index entries.
text:uidField "uid" ;
text:langField "lang" ;
text:graphField "graph" ;
text:map (
[ text:field "lsrm_lmz_title" ; text:predicate lsrm:lmz_title]
[ text:field "lsrm_lmz_content" ; text:predicate lsrm:lmz_content]
)
.



Re: Dumping RDF data as human readable table

2021-04-22 Thread Mikael Pesonen



I think that is for serializing the triples as they are, in a cleaner 
format, but it doesn't say anything about rearranging the data.


On 22/04/2021 12.34, Martynas Jusevičius wrote:

This is probably what you want: https://www.w3.org/TR/sparql11-results-csv-tsv/

Try curl -H "Accept: text/csv" with SELECT results.
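
The same request can be sketched in Python with only the standard library. 
The endpoint URL, the example.org prefix, and the helper names below are 
assumptions for illustration. The query uses one OPTIONAL per predicate so 
that each known predicate becomes its own column; it does not discover 
predicates automatically.

```python
import urllib.parse
import urllib.request

# Hypothetical query: one OPTIONAL per predicate, one output column each.
QUERY = """
PREFIX : <http://example.org/>
SELECT ?s ?p1 ?p2 ?p3_p1 WHERE {
  ?s ?anyPredicate ?anyObject .
  OPTIONAL { ?s :p1 ?p1 }
  OPTIONAL { ?s :p2 ?p2 }
  OPTIONAL { ?s :p3/:p1 ?p3_p1 }
}
"""


def build_csv_request(endpoint, query):
    """Build a SPARQL Protocol GET request asking for CSV results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": "text/csv"})


def fetch_csv(endpoint, query):
    """Send the request and return the CSV body as text."""
    with urllib.request.urlopen(build_csv_request(endpoint, query)) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Assumed endpoint; adjust host, port and dataset name to your server.
    print(fetch_csv("http://localhost:3030/ds/query", QUERY))
```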

On Thu, Apr 22, 2021 at 11:30 AM Mikael Pesonen
 wrote:


Hi,

not exactly Jena related, but does anyone know if there is a tool or
sparql query that would read (almost) any kind of RDF data and make a
csv sheet where each column has predicates as headers and values as
cells. It would also be nice to get linked resources into the table the
same way. Even a hint how to make each predicate a new column in sparql
would be helpful.

So for example

:s :p1 "label1" .
:s :p2 "label2" .
:s :p3 :s2 . :s2 :p1 "label3" .

->
      :p1       :p2       :p3/:p1
:s    label1    label2    label3

Br,
Mikael






Dumping RDF data as human readable table

2021-04-22 Thread Mikael Pesonen



Hi,

not exactly Jena related, but does anyone know if there is a tool or 
SPARQL query that would read (almost) any kind of RDF data and make a 
CSV sheet where each column has a predicate as its header and the values 
as cells. It would also be nice to get linked resources into the table the 
same way. Even a hint on how to make each predicate a new column in SPARQL 
would be helpful.


So for example

:s :p1 "label1" .
:s :p2 "label2" .
:s :p3 :s2 . :s2 :p1 "label3" .

->
      :p1       :p2       :p3/:p1
:s    label1    label2    label3

Br,
Mikael



Re: 'Failed startup of context'

2021-03-23 Thread Mikael Pesonen



That was it: there was an existing lock file in the data folder.
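
For anyone hitting the same message, a quick way to inspect the lock file 
before starting the server can be sketched as below. The describe_lock_file 
helper is hypothetical, and the file name tdb.lock is an assumption; check 
what is actually present in your data folder.

```python
import os


def describe_lock_file(data_dir, lock_name="tdb.lock"):
    """Report whether a lock file exists in data_dir and can be read."""
    path = os.path.join(data_dir, lock_name)
    if not os.path.lexists(path):
        return "absent"
    if not os.path.isfile(path):
        return "not a regular file"
    if not os.access(path, os.R_OK):
        return "not readable"
    return "present and readable"


if __name__ == "__main__":
    # Path taken from the tdb2:location value in the config of this thread.
    print(describe_lock_file("/opt/insight/jena_data/"))
```

Run it as the same user that runs Fuseki, since readability depends on that 
user's permissions.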

On 23/03/2021 10.33, Andy Seaborne wrote:
That stacktrace looks like two stacktraces sliced together.  Maybe 
overlap, maybe missing cause line, maybe from two different processes.


One is failing -

On 22/03/2021 16:01, Mikael Pesonen wrote:
We have updated Jena config from TDB to TDB2 and get error below the 
config. Identical config is working on another server with same setup...

...


---
 > sudo journalctl | grep useki


Started Apache Jena Fuseki.
[2021-03-22 17:46:45] Server INFO  Apache Jena Fuseki 3.17.0
[2021-03-22 17:46:45] WebAppContext WARN  Failed startup of context 
o.e.j.w.WebAppContext@15dd5ac2{Apache Jena Fuseki 
Server,/,file:///opt/insight/jena/webapp/,UNAVAILABLE}
 at 
org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.openBySpecificType(AssemblerGroup.java:165) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.open(AssemblerGroup.java:144) 
~[fuseki-server.jar:3.17.0]

...

"Unable to check TDB lock owner for this location since the expected 
lock file is not a file/not readable. "


org.apache.jena.tdb.base.file.LocationLock.checkLockFileForRead(LocationLock.java:276) 
~[fuseki-server.jar:3.17.0]
 at 
org.apache.jena.tdb.base.file.LocationLock.getOwner(LocationLock.java:104) 
~[fuseki-server.jar:3.17.0]





'Failed startup of context'

2021-03-22 Thread Mikael Pesonen
We have updated the Jena config from TDB to TDB2 and now get the error 
shown below the config. An identical config works on another server with 
the same setup...


PREFIX :    <#>
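PREFIX fuseki:  <http://jena.apache.org/fuseki#>
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ja:      <http://jena.hpl.hp.com/2005/11/Assembler#>

PREFIX tdb2:    <http://jena.apache.org/2016/tdb#>
PREFIX text:    <http://jena.apache.org/text#>

PREFIX lsrm:    <https://resource.lingsoft.fi/ns/resource_meta#>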

[] rdf:type fuseki:Server ;
   fuseki:services (
 :service
   ) .

:service rdf:type fuseki:Service ;
    fuseki:name "/ds" ;   # http://host:port/ds
    fuseki:serviceQuery "query" ;    # SPARQL query service
    fuseki:serviceQuery "sparql" ;   # SPARQL query service
    fuseki:serviceUpdate    "update" ;   # SPARQL update service
    fuseki:serviceUpload    "upload" ;   # Non-SPARQL upload service
    fuseki:serviceReadWriteGraphStore "data" ; # SPARQL Graph store protocol (read and write)

    fuseki:dataset   :text_dataset ;
    .

:text_dataset rdf:type text:TextDataset ;
    text:dataset   :tdb2_dataset ;
    text:index :indexLucene ;
    .
:tdb2_dataset rdf:type  tdb2:DatasetTDB2 ;
    tdb2:location "/opt/insight/jena_data/" ;
    .

# Text index description
:indexLucene a text:TextIndexLucene ;
    text:directory   ;
    text:entityMap :entMap ;
    text:storeValues true ;
    #text:analyzer [ a text:StandardAnalyzer ] ;
    #text:queryAnalyzer [ a text:KeywordAnalyzer ] ;
    #text:queryParser text:QueryParser ; # text:AnalyzingQueryParser
    #text:multilingualSupport true ;
    .

:entMap a text:EntityMap ;
    text:entityField  "uri" ;
    ## Must be defined in the text:map
    text:defaultField "lsrm_lmz_title" ;
    ## Enable deleting of text index entries.
    text:uidField "uid" ;
    text:langField    "lang" ;
    text:graphField   "graph" ;
    text:map (
        [ text:field "lsrm_lmz_title" ; text:predicate lsrm:lmz_title]
        [ text:field "lsrm_lmz_content" ; text:predicate lsrm:lmz_content]
    )
    .


---
> sudo journalctl | grep useki


Started Apache Jena Fuseki.
[2021-03-22 17:46:45] Server INFO  Apache Jena Fuseki 3.17.0
[2021-03-22 17:46:45] WebAppContext WARN  Failed startup of context 
o.e.j.w.WebAppContext@15dd5ac2{Apache Jena Fuseki 
Server,/,file:///opt/insight/jena/webapp/,UNAVAILABLE}
    at 
org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.openBySpecificType(AssemblerGroup.java:165) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.open(AssemblerGroup.java:144) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.assembler.assemblers.AssemblerGroup$ExpandingAssemblerGroup.open(AssemblerGroup.java:93) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.assembler.assemblers.AssemblerBase.open(AssemblerBase.java:39) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.assembler.assemblers.AssemblerBase.open(AssemblerBase.java:35) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.fuseki.build.FusekiConfig.getDataset(FusekiConfig.java:687) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.fuseki.build.FusekiConfig.buildDataService(FusekiConfig.java:444) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.fuseki.build.FusekiConfig.buildDataAccessPoint(FusekiConfig.java:434) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.fuseki.webapp.FusekiWebapp.configFromTemplate(FusekiWebapp.java:323) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.fuseki.webapp.FusekiWebapp.initServerConfiguration(FusekiWebapp.java:252) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.fuseki.webapp.FusekiWebapp.initializeDataAccessPoints(FusekiWebapp.java:219) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.fuseki.webapp.FusekiServerListener.serverInitialization(FusekiServerListener.java:97) 
~[fuseki-server.jar:3.17.0]
    at 
org.apache.jena.fuseki.webapp.FusekiServerListener.contextInitialized(FusekiServerListener.java:57) 
~[fuseki-server.jar:3.17.0]
    at 
org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:1068) 
~[fuseki-server.jar:3.17.0]
    at 
org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:572) 
~[fuseki-server.jar:3.17.0]
    at 
org.eclipse.jetty.server.handler.ContextHandler.contextInitialized(ContextHandler.java:997) 
~[fuseki-server.jar:3.17.0]
    at 
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:746) 
~[fuseki-server.jar:3.17.0]
    at 
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:379) 
~[fuseki-server.jar:3.17.0]
    at 

Re: Out of place: [EOF] while importing trig file

2021-03-15 Thread Mikael Pesonen



It looks like there is an accidental huge literal in the data. With 32G of 
RAM the import succeeded, but the number of quads came to 45M, while the 
trig file has 75M lines.
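
One way to hunt for such a literal without loading the file into memory is 
to stream it and report the longest lines. This is a rough sketch with a 
hypothetical longest_lines helper and file name; it will not flag a 
triple-quoted literal that is spread over many short lines.

```python
import heapq


def longest_lines(lines, top=5):
    """Return (length, line_number) pairs for the longest lines, longest first."""
    lengths = (
        (len(line.rstrip("\n")), number)
        for number, line in enumerate(lines, start=1)
    )
    return heapq.nlargest(top, lengths)


if __name__ == "__main__":
    # Hypothetical file name; the file object is iterated line by line.
    with open("dump.trig", encoding="utf-8", errors="replace") as handle:
        for length, number in longest_lines(handle):
            print(f"line {number}: {length} characters")
```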


On 12/03/2021 19.04, Andy Seaborne wrote:

Maybe an unclosed string literal or URI

Run through "riot" and see what the last parsed graph/triple is.

On 12/03/2021 15:17, Mikael Pesonen wrote:


Hi,

I'm trying to import an 8G trig file which was dumped from Jena. However, 
16G of RAM isn't enough, so I'm trying to split the trig file and 
import it in parts. The first part is 2600 lines, but the import fails:


ERROR Fuseki  :: [line: 2601, col: 1 ] Out of place: [EOF]

tail of first part is (quotes added)

"<http://data.finlex.fi/eli/sd/2014/610/luku/9/pykala/2/momentti/1/johdanto> 


 a <http://data.finlex.fi/schema/sfl/Preamble> ;
<http://data.europa.eu/eli/ontology#has_member>
<http://data.finlex.fi/eli/sd/2014/610/luku/9/pykala/2/momentti/1/johdanto/alkup> 
.


"

so it should be ok. What could cause the error? Btw is there a better 
way to copy data than using a trig dump file when it's too big for 
import?





Re: Out of place: [EOF] while importing trig file

2021-03-15 Thread Mikael Pesonen



Or maybe a better question: is it possible to stream large amounts of data 
into Jena TDB with limited RAM?


On 15/03/2021 12.22, Mikael Pesonen wrote:


Sorry, how do I get riot? It's not in the Jena/Fuseki bin folder.

On 12/03/2021 19.04, Andy Seaborne wrote:

Maybe an unclosed string literal or URI

Run through "riot" and see what the last parsed graph/triple is.

On 12/03/2021 15:17, Mikael Pesonen wrote:


Hi,

I'm trying to import an 8G trig file which was dumped from Jena. However, 
16G of RAM isn't enough, so I'm trying to split the trig file and 
import it in parts. The first part is 2600 lines, but the import fails:


ERROR Fuseki  :: [line: 2601, col: 1 ] Out of place: [EOF]

tail of first part is (quotes added)

"<http://data.finlex.fi/eli/sd/2014/610/luku/9/pykala/2/momentti/1/johdanto> 


 a <http://data.finlex.fi/schema/sfl/Preamble> ;
<http://data.europa.eu/eli/ontology#has_member>
<http://data.finlex.fi/eli/sd/2014/610/luku/9/pykala/2/momentti/1/johdanto/alkup> 
.


"

so it should be ok. What could cause the error? Btw is there a 
better way to copy data than using a trig dump file when it's too 
big for import?






