Re: Querying URL with square brackets

2023-11-24 Thread Laura Morales
> If you want a page for every book, don't use fragment URIs. Use
> http://example.org/book/1 or http://example.org/book/1#this instead of
>  http://example.org/book#1.

yes yes I agree with this. I only tried to present an example of yet another 
"quirk" between raw data and browsers (where this kind of data is supposed to 
be used).


Re: Querying URL with square brackets

2023-11-24 Thread Laura Morales
> > in the case that I want to use these URLs with a web browser.
>
> I don't understand what the trouble with the above example is?

The problem with # is that browsers treat it as the start of a local 
reference. When you open http://example.org/book#1 the server only receives 
http://example.org/book. In other words it would be an error to create nodes 
for n different books (#1 #2 #3 #n) if my goal is also to use these URLs with a 
browser (for example if I want to show one page for every book). It's not a 
problem with Jena, it's a problem with the way browsers treat the fragment.
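As a minimal Turtle sketch of the slash-style alternative (the example.org names are placeholders), each book gets a URL whose whole path is sent to the server:

```turtle
@prefix : <http://example.org/book/> .

# http://example.org/book/1 — the full path reaches the server,
# so it can return a distinct page per book (no fragment is lost).
:1 a :Book .
:2 a :Book .
```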


Re: Querying URL with square brackets

2023-11-24 Thread Laura Morales
> What do you mean by human-readable here? For large technical systems it's
> simply not feasible to encode meaning into the URI and I might even
> consider it an anti-pattern.

This is my problem. I do NOT want to encode any meaning into URLs, but I do 
want them to be human-readable simply because 1) properties are URLs too, 2) 
they can be used online, and 3) they are simpler to work with, for example 
editing in a Turtle file or writing a query.

:alice :knows :bob
vs
:dsa7hdsahdsa782j :d93ifg75jgueeywu :s93oeirugj290sjf

I can avoid [ entirely, but it raises the question of what other characters I 
MUST avoid.
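For reference, SPARQL's ENCODE_FOR_URI gives one conservative answer: it percent-encodes every character outside RFC 3986's unreserved set (letters, digits, and - . _ ~), so brackets come out escaped. A sketch:

```sparql
# ENCODE_FOR_URI escapes everything outside A-Z a-z 0-9 - . _ ~
SELECT (ENCODE_FOR_URI("foo[1]bar") AS ?escaped)   # -> "foo%5B1%5Dbar"
WHERE { }
```

Anything in the unreserved set is always safe; everything else is at least suspect in some position of a URI.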


Re: Querying URL with square brackets

2023-11-24 Thread Laura Morales
Thank you a lot. FILTER(STR(?id) = "...") works, as suggested by Andy. I do 
recognize though that it is a hack, and that URLs should probably not have a [.

But now I have trouble understanding UTF8 addresses. I would use random 
alphanumeric URLs everywhere if I could, or I would %-encode everything. But 
node IDs (URLs) are supposed to be valid, human-readable URLs because they're 
used online. Jena, and browsers, work fine with IRIs (which are UTF8), but the 
way special characters are used is not the same. For example it's perfectly 
fine in my graph to have a URL fragment, such as http://example.org/foo#bar but 
these URLs are not usable with a browser because the fragment is a local 
reference (local to the browser) that is not sent to the server. Which means in 
practice, that if I want to stay out of trouble I should not create a graph 
with IDs

http://example.org/book#1
http://example.org/book#2
http://example.org/book#3

in the case that I want to use these URLs with a web browser. Vice versa, 
browsers are perfectly fine with a [ in the path, but Jena is stricter.

So, if I want to use UTF8 addresses (IRIs) in my graph, and if I don't want to 
%-encode them because I want them to be human-readable (also because they are 
much easier to read/edit manually), what is the list of characters that MUST be 
%-encoded?


> Sent: Friday, November 24, 2023 at 9:55 AM
> From: "Marco Neumann" 
> To: users@jena.apache.org
> Subject: Re: Querying URL with square brackets
>
> Laura, see jena issue #2102
> https://github.com/apache/jena/issues/2102
>
> Marco


Querying URL with square brackets

2023-11-23 Thread Laura Morales
I have a few URLs containing square brackets like http://example.org/foo[1]bar
I can create a TDB2 dataset without much trouble, with warnings but no errors. 
I can also query these nodes "indirectly", that is if I query them by some 
property and not by URI. My problem is that I cannot query them directly by 
URI. As soon as I try to use the URIs explicitly in a query, for example 
"DESCRIBE ", I receive this error

ERROR SPARQL  :: [line: 1, col: 10] Bad IRI: 
'http://example.org/foo[1]bar':  Code: 
0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for 
URIs/IRIs.

I tried escaping, "foo\[1\]bar" but it doesn't work.
I tried converting from a string, FILTER(?id = 
URI("http://example.org/foo[1]bar")) but it doesn't work
What else could I try?
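One workaround (the one that eventually worked in this thread) is to compare the term's string form instead of writing the IRI literally, so the SPARQL parser never has to accept the brackets as an IRI token; a sketch:

```sparql
SELECT ?p ?o
WHERE {
  ?id ?p ?o .
  # avoids parsing <http://example.org/foo[1]bar> as an IRI
  FILTER(STR(?id) = "http://example.org/foo[1]bar")
}
```

Note this scans rather than doing a direct lookup, so it is a hack, not a fix for the IRI itself.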


ML archive no longer available

2023-11-22 Thread Laura Morales
The "help and support"[1] page links to jena.markmail.org which unfortunately 
is no longer available (it has shut down).

[1] https://jena.apache.org/help_and_support/


Re: Implicit default-graph-uri

2023-11-18 Thread Laura Morales
I've tried this option too using the following configuration


fuseki:dataset [
a ja:RDFDataset;

ja:defaultGraph [
a ja:UnionModel ;

ja:subModel [
a tdb2:GraphTDB2 ;
tdb2:dataset [
a tdb2:DatasetTDB2 ;
tdb2:location "location1"
]
] ;

ja:subModel [
a tdb2:GraphTDB2 ;
tdb2:dataset [
a tdb2:DatasetTDB2 ;
tdb2:location "location2"
]
] ;
]
]


but it always gives me "transaction error" with any query. I've tried TDB 1 
instead, but it gives me a different error:

ERROR Server  :: Exception in initialization: the (group) Assembler 
org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup@b73433 
cannot construct the object [...] [ja:subModel of [...] [ja:defaultGraph of 
[...] ]] because it does not have an implementation for the objects's most 
specific type ja:Model

I've found a couple of old threads online with people reporting "MultiUnion" as 
working, but I don't know how to use this configuration. I couldn't find it on 
the Fuseki documentation, and simply replacing ja:UnionModel with 
ja:MultiUnionModel doesn't make any difference for me.
Do you know anything about this MultiUnion and if it could work?




> Sent: Friday, November 17, 2023 at 8:47 PM
> From: "Andy Seaborne" 
> To: users@jena.apache.org
> Subject: Re: Implicit default-graph-uri
>
>
>
> On 16/11/2023 11:35, Laura Morales wrote:
> > I would like to configure Fuseki such that I can use 2 datasets from 2 
> > different locations, as if they were a single dataset.
> > This is my config.ttl:
> >
> >
> > <#> a fuseki:Service ;
> >
> >  fuseki:endpoint [
> >  fuseki:operation fuseki:query
> >  ] ;
> >
> >  fuseki:dataset [
> >  a ja:RDFDataset ;
> >
> >  ja:namedGraph [
> >  ja:graphName :graph1 ;
> >  ja:graph [
> >  a tdb2:GraphTDB ;
> >  tdb2:location "location-1" ;
> >  ]
> >  ] ;
> >
> >  ja:namedGraph [
> >  ja:graphName :graph2 ;
> >  ja:graph [
> >  a tdb2:GraphTDB ;
> >  tdb2:location "location-2" ;
> >  ]
> >  ] ;
> >  ] .
> >
> >
> > There is no particular reason why I used this configuration; I mostly 
> > copied it from the Fuseki documentation. If it can be simplified, please 
> > suggest how.
> >
> > I query Fuseki with 
> > "/service/query/?default-graph-uri=urn:x-arq:UnionGraph". I also know that 
> > I can use "SELECT FROM ". But I would like to know if 
> > I can configure this behavior as the default in the main configuration 
> > file, such that I can avoid using "x-arq:UnionGraph" entirely.
> > Both datasets are TDB2 and contain triples only in the default unnamed 
> > graph (in other words do not contain any named graph inside).
>
> I can't find a way to do that.
>
> tdb2:unionDefaultGraph applies to a single dataset and you have two
> datasets.
>
> Using
>ja:defaultGraph [
>  a ja:Model;
>  ja:subModel ...
>  ja:subModel ...
>  ] ;
>
> falls foul of transaction coordination across two different models (even
> if they are views of the same database).
>
> I thought that would work - there is some attempt to extend transactions
> into graphs but this seems to be pushing things too far.
>
>  Andy
>


Re: Implicit default-graph-uri

2023-11-16 Thread Laura Morales
I also tried tdb2:unionDefaultGraph like this

fuseki:dataset [
a ja:RDFDataset ;
tdb2:unionDefaultGraph true ;
ja:namedGraph [ ... ] ;
ja:namedGraph [ ... ] ;
]

but it's not making any difference. I always get 0 triples when querying, 
unless I add x-arq:UnionGraph explicitly.
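For what it's worth, tdb2:unionDefaultGraph is a property of a tdb2:DatasetTDB2, not of a ja:RDFDataset, which would explain it being ignored above. A sketch of where it would normally go (single-database case only; this does not solve the two-location problem):

```turtle
fuseki:dataset [
    a tdb2:DatasetTDB2 ;
    tdb2:location "location1" ;
    # applies per TDB2 database; it has no effect on a ja:RDFDataset
    tdb2:unionDefaultGraph true ;
] ;
```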


Implicit default-graph-uri

2023-11-16 Thread Laura Morales
I would like to configure Fuseki such that I can use 2 datasets from 2 
different locations, as if they were a single dataset.
This is my config.ttl:


<#> a fuseki:Service ;

fuseki:endpoint [
fuseki:operation fuseki:query
] ;

fuseki:dataset [
a ja:RDFDataset ;

ja:namedGraph [
ja:graphName :graph1 ;
ja:graph [
a tdb2:GraphTDB ;
tdb2:location "location-1" ;
]
] ;

ja:namedGraph [
ja:graphName :graph2 ;
ja:graph [
a tdb2:GraphTDB ;
tdb2:location "location-2" ;
]
] ;
] .


There is no particular reason why I used this configuration; I mostly copied it 
from the Fuseki documentation. If it can be simplified, please suggest how.

I query Fuseki with "/service/query/?default-graph-uri=urn:x-arq:UnionGraph". I 
also know that I can use "SELECT FROM ". But I would like 
to know if I can configure this behavior as the default in the main 
configuration file, such that I can avoid using "x-arq:UnionGraph" entirely.
Both datasets are TDB2 and contain triples only in the default unnamed graph 
(in other words do not contain any named graph inside).


Re: OOM Killed

2023-07-21 Thread Laura Morales
> Could you try Java17?

I did try Java 17 (default jre available from Debian) but I didn't notice any 
difference. I only ran 2 tests with Java 17 though, because the job takes a 
long time to fail/finish. I only noticed a significant difference when tweaking 
the GC options.

However I did another test on a 3rd PC, with the same exact software setup 
(Debian 12, Java 17, Fuseki 4.9.0) but completely different hardware (i5 
7th-gen 4C4T, 16GB RAM, NVMe PCIe-3 x4) and it completed in ~30m (instead of 
~2h) and the max RAM usage for the Java process was 11.3GB *without* tuning any 
GC or Fuseki options.

I don't know what to make of this. Maybe it's the hardware after all, and not 
Fuseki. Or maybe the GC has more time for doing its thing with a NVMe drive 
instead of SATA/USB. However 11GB still seems very high. As far as I know, each 
HTTP request should be short lived, and all the memory of every HTTP request 
should be freed (that's why it's strange that I see the memory grow over time).


Re: OOM Killed

2023-07-14 Thread Laura Morales
> Have you tried different garbage collectors?

WOAH I didn't even consider that before you mentioned it! I did this

JVM_ARGS="-XX:+UseSerialGC -Xmx4G" ./fuseki-server ...

and RAM usage of the java process peaked at 12GB

$ cat /proc/108344/status | grep VmHWM
VmHWM:  11916368 kB

Unfortunately I'm not at all familiar with Java garbage collectors. I don't 
understand why this option would use 1/3 less RAM than the default GC.
What other options are available for a more aggressive GC? I'm more interested 
in reducing RAM usage than raw query performance.


Re: OOM Killed

2023-07-12 Thread Laura Morales
Well, I think the OOM trigger that prompted Linux to kill Fuseki was the fact 
that I had a very small swap space (1GB). After adding a new swap partition 
(256GB) I don't see any errors anymore (OOM or heap space).

On my PC with 16GB RAM, it used 2-3GB of swap and took approximately the same 
amount of time to finish as in my previous tests.
On my PC with 8GB RAM, it used 9-10GB of swap and took significantly longer to 
finish, ~3h instead of ~2h.

The good news for me is, I guess, that I've found something that works for me. 
On the other hand I think there is a memory problem with Fuseki because it 
doesn't feel right when it's using that much RAM for processing read queries in 
series (not in parallel).
I would still love to know if there are options for forcing Fuseki to stay 
within a given amount of RAM.


Re: CVE-2023-32200: Apache Jena: Exposure of execution in script engine expressions.

2023-07-11 Thread Laura Morales
Is there a demonstration of the exploit? I'd like to try it


> Sent: Tuesday, July 11, 2023 at 6:44 PM
> From: "Andy Seaborne" 
> To: annou...@apache.org, users@jena.apache.org
> Subject: CVE-2023-32200: Apache Jena: Exposure of execution in script engine 
> expressions.
>
> Severity: important
>
> Affected versions:
>
> - Apache Jena 3.7.0 through 4.8.0
>
> Description:
>
> There is insufficient restrictions of called script functions in Apache Jena
>  versions 4.8.0 and earlier. It allows a
> remote user to execute javascript via a SPARQL query.
> This issue affects Apache Jena: from 3.7.0 through 4.8.0.
>
> Credit:
>
> s3gundo of Alibaba (reporter)
>
> References:
>
> https://www.cve.org/CVERecord?id=CVE-2023-22665
> https://jena.apache.org/
> https://www.cve.org/CVERecord?id=CVE-2023-32200
>
>


Re: OOM Killed

2023-07-11 Thread Laura Morales
I've started Fuseki

./fuseki-server --loc=database --localhost --port=7000 /query

and then I've started my script, with 16 threads (each one making a request to 
Fuseki). I did not set -Xmx, but I saw in htop that 4GB was chosen 
automatically. I did this on my PC with 16GB RAM, 2C4T CPU. When I was seeing 
Fuseki using around 12GB of memory, I would pause my script, restart Fuseki, 
and resume my script. I did this a couple of times and the job completed 
without issues (no OOM and no crashes). It took ~2h.

I don't know if this adds any useful information.


Re: OOM Killed

2023-07-11 Thread Laura Morales
>
> Laura, Dave,
> 
> This doesn't sound like the same issue but let's see.
> 
> Dave - your situation isn't under high load is it?
> 
> - Is it in a container? If so:
>Is it the container being killed OOM or
>  Java throwing an OOM exception?
>Much RAM does the container get? How many threads?
> 
> - If not a container, how many CPU Threads are there? How many cores?
> 
> - Which form of Fuseki are you using?
> 
> what does
>java -XX:+PrintFlagsFinal -version \
> | grep -i 'M..HeapSize'
> 
> say?
> 
> How are you sending the queries to the server?
> 
> On 09/07/2023 20:33, Laura Morales wrote:
> > I'm running a job that is submitting a lot of queries to a Fuseki server, 
> > in parallel. My problem is that Fuseki is OOM-killed and I don't know how 
> > to fix this. Some details:
> > 
> > - Fuseki is queried as fast as possible. Queries take around 50-100ms to 
> > complete so I think it's serving 10s of queries each second
> 
> Are all the queries about the same amount of work or are some going to 
> cause significantly more memory use?
> 
> It is quite possible to send queries faster than the server can process 
> them - there is little point sending in parallel more than there are 
> real CPU threads to service them.
> 
> They will interfere and the machine can end up going slower (in terms of 
> queries per second).
> 
> I don't know exactly the impact on the GC but I think the JVM delays 
> minor GC's when very busy but that pushes it to do major ones earlier.
> 
> A thing to try is use less parallelism.
> 
> > - Fuseki 4.8. OS is Debian 12 (minimal installation with only OS, Fuseki, 
> > no desktop environments, uses only ~100MB of RAM)
> > - all the queries are read queries. No updates, inserts, or other write 
> > queries
> > - all the queries are over HTTP to the Fuseki endpoint
> > - database is TDB2 (created with tdb2.tdbloader)
> > - database contains around 2.5M triples
> > - the machine has 8GB RAM. I've tried on another PC with 16GB and it 
> > completes the job. On 8GB though, it won't
> > - with -Xmx6G it's killed earlier. With -Xmx2G it's killed later. Either 
> > way it's always killed.
> 
> Is it getting OOM at random or do certain queries tend to push it over 
> the edge?
> 
> Is it that the machine (container) has 8G RAM and there is no -Xmx setting? 
> In that case the default setting applies, which is 25% of RAM.
> 
> A heap dump to know where the memory is going would be useful.
> 
> > Is there anything that I can tweak to avoid Fuseki getting killed? 
> > Something that isn't "just buy more RAM".
> > Thank you
>


Re: OOM Killed

2023-07-09 Thread Laura Morales
- database contains around 2.5M triples and is ~4GB in size on disk


OOM Killed

2023-07-09 Thread Laura Morales
I'm running a job that is submitting a lot of queries to a Fuseki server, in 
parallel. My problem is that Fuseki is OOM-killed and I don't know how to fix 
this. Some details:

- Fuseki is queried as fast as possible. Queries take around 50-100ms to 
complete so I think it's serving 10s of queries each second
- Fuseki 4.8. OS is Debian 12 (minimal installation with only OS, Fuseki, no 
desktop environments, uses only ~100MB of RAM)
- all the queries are read queries. No updates, inserts, or other write queries
- all the queries are over HTTP to the Fuseki endpoint
- database is TDB2 (created with tdb2.tdbloader)
- database contains around 2.5M triples
- the machine has 8GB RAM. I've tried on another PC with 16GB and it completes 
the job. On 8GB though, it won't
- with -Xmx6G it's killed earlier. With -Xmx2G it's killed later. Either way 
it's always killed.

Is there anything that I can tweak to avoid Fuseki getting killed? Something 
that isn't "just buy more RAM".
Thank you


Filter nodes by type

2021-09-27 Thread Laura Morales
This query returns two types of nodes

SELECT ?node
WHERE { ?node ?prop ?val }

{'node': {'type': 'uri', 'value': ''}}
{'node': {'type': 'triple', 'value': '' }}

Is it possible to filter by type (I need only type=uri)?
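For anyone searching later: the standard isIRI() function does exactly this filtering; a sketch:

```sparql
SELECT ?node
WHERE {
  ?node ?prop ?val .
  # keeps IRI nodes; drops RDF-star triple terms and blank nodes
  FILTER(isIRI(?node))
}
```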


Re: Faster TDB2 build?

2021-09-12 Thread Laura Morales
Just a personal curiosity... are you building it on a SSD or HDD? What is your 
"triples loaded per second" rate?


> Sent: Sunday, September 12, 2021 at 2:39 AM
> From: "Cristóbal Miranda" 
> To: users@jena.apache.org
> Subject: Faster TDB2 build?
>
> Hi,
> 
> I'm running tdb2.tdbloader on Wikidata, but it's
> taking too long, now it's on day 11 and still indexing,
> whereas tdbloader2 (for TDB) didn't take as much for me.
> I was wondering if something could be done to allow
> more space on RAM for the build phase in order to be faster,
> for example passing a memory budget parameter to the
> loader. Not sure exactly how the extra RAM space would be
> used, but I was thinking that maybe if more b+tree blocks
> were kept in RAM this processing would be faster, for
> example keeping 2 upper levels of the tree in primary memory,
> or even everything in there if the given budget allowed it.
> 
> What would it take to implement such a feature? maybe in a
> tdb2.tdbloader2? I was looking at the code for a way to do something
> but couldn't find an easy modification to achieve this.
>


Re: riot --base option does not work with single letter URI schema

2021-09-11 Thread Laura Morales
> It looks like a MS Windows drive letter, which then isn't handled very
> gracefully.

I'm not using Windows though.


Re: riot --base option does not work with single letter URI schema

2021-09-11 Thread Laura Morales
> What version of Jena are you using? They all seem to work for me with 3.15.0

$ riot --version
Jena:   VERSION: 4.1.0
Jena:   BUILD_DATE: 2021-05-31T20:32:25+


Re: riot --base option does not work with single letter URI schema

2021-09-11 Thread Laura Morales
No no I mean "schema" indeed. Let me give a better example.

These work:
riot --base="isbn:" --syntax=ttl --output=nt <( echo "<0123> 
  ." )
riot --syntax=ttl --output=nt <( echo "base  <0123> 
  ." )
riot --syntax=ttl --output=nt <( echo "base  <0123> 
  ." )

This doesn't:
riot --base="n:" --syntax=ttl --output=nt <( echo "<0123> 
  ." )


> Sent: Saturday, September 11, 2021 at 9:15 AM
> From: "Martynas Jusevičius" 
> To: users@jena.apache.org
> Subject: Re: riot --base option does not work with single letter URI schema
>
> You probably mean “prefix” not “schema”?
> 
> And the result should be
> 
> x:alice x:knows x:bob
> 
> Prefixed URIs don’t get the <> brackets.


riot --base option does not work with single letter URI schema

2021-09-10 Thread Laura Morales
This command

riot --base="x:" --syntax=ttl --output=nt <( echo "   ." 
)

returns this triple

   
.

instead of

   .

Riot version 4.1.0.


sparql --loc=

2021-09-04 Thread Laura Morales
Is there a reason why the "sparql" command in Jena (4.1.0) does not support the 
--loc option? Or is it just not implemented?


Re: Fuseki: how can I inject standard prefixes to every query?

2021-09-03 Thread Laura Morales
What I mean is really simple. Basically I'd like to send queries like this to 
Fuseki:

SELECT ?l
WHERE { ?s rdfs:label ?l }

this is not a valid query because it does not define the rdfs prefix. But I'd 
like to prepend some standard prefixes to every incoming request before they 
are executed. So the request above is transformed into this before being 
executed:

PREFIX rdfs: 
SELECT ?l
WHERE { ?s rdfs:label ?l }

Now I've made an example with rdfs and I hope it's simple enough and that it 
makes sense. In practice, I'd like to have a list of prefixes, even custom 
ones. My end goal is that I can send queries without having to define any PREFIX 
because they are automatically attached by the server, at every request. If I 
need any particular prefix, or if I need to override one of the default 
prefixes, then those will be specified in the specific query.

Hope it makes sense, thank you.




> Sent: Friday, September 03, 2021 at 8:35 AM
> From: "Michael Wechner" 
> To: users@jena.apache.org
> Subject: Re: Fuseki: how can I inject standard prefixes to every query?
>
> can you give an example?
>
> Thanks
>
> Michael


Fuseki: how can I inject standard prefixes to every query?

2021-09-02 Thread Laura Morales
The idea is to prepend some standard prefixes at every incoming query, before 
it's executed, such that I don't have to write them every time.


Fuseki web UI not showing Table results when using RDF-star

2021-09-02 Thread Laura Morales
I've just noticed that after inserting this kind of data

:alice :knows :bob {| :since 2010 |}

the web UI does not show any results in the Table tab when a query returns a 
"type":"triple".


Re: In-memory Fuseki keeps growing memory indefinitely even if idle

2021-07-27 Thread Laura Morales
I can actually confirm that this was happening to me as well on a VM with 4GB 
RAM and TDB store (thus not in-memory). As more and more queries were made, RAM 
filled up, then the SWAP filled up, and at that point I had to reboot the 
machine. My solution was just to restart the server with a cronjob.


> Sent: Tuesday, July 27, 2021 at 3:19 PM
> From: "Marco Fiocco" 
> To: users@jena.apache.org
> Subject: In-memory Fuseki keeps growing memory indefinitely even if idle
>
> Hello,
>
> I'm running a in-memory Fuseki 3.16 server and I see that the allocated 
> memory keeps growing linearly indefinitely even if idle.
> Initially I reserved 1GB of memory and I've noticed that the process gets OOM 
> killed every 2 hours. Now I've allocated 2GB because I've read somewhere that 
> 2GB is the minimum for Java heaps. Is that true?
> I'm waiting to see if it will get killed again.
> Is this a bug, or is there a better way to configure it?
>
> My Fuseki config is:
>
> @prefix fuseki:   .
> @prefix rdf:  .
> @prefix rdfs: .
> @prefix tdb:  .
> @prefix ja:   .
> @prefix :<#> .
>
> [] rdf:type fuseki:Server .
>
> <#service> rdf:type fuseki:Service ;
> rdfs:label  "Dataset with SHACL validation" ;
> fuseki:name "ds" ;
> # See the 
> endpoint url in build.gradle
> fuseki:serviceReadWriteGraphStore "data" ;
> # SPARQL Graph store 
> protocol (read and write)
> fuseki:endpoint   [ fuseki:operation fuseki:query ;   fuseki:name 
> "sparql"  ] ;   # SPARQL query service
> fuseki:endpoint   [ fuseki:operation fuseki:shacl ;   fuseki:name 
> "shacl" ] ; # SHACL query service
> fuseki:dataset  <#dataset> .
>
> ## In memory TDB with union graph.
> <#dataset> rdf:type   tdb:DatasetTDB ;
>   tdb:location "--mem--" ;
>   # Query timeout on this dataset (1s, 1000 milliseconds)
>   ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "1000" ] ;
>   # Make the default graph be the union of all named graphs.
>   tdb:unionDefaultGraph true .
>
> Thanks
> Marco
>


Re: Does Jena use duck typing?

2021-07-25 Thread Laura Morales
What I mean is... (with an example):
As far as I understand, these are valid triples that I can load into 
Fuseki/Jena at the same time in the same graph

:alice :age "25"^^xsd:string .
:bob   :age "20"^^xsd:integer .

If then I execute this query:

SELECT *
WHERE
{
?s :age ?age .
FILTER (?age < 30)
}

should Jena raise an error (because there is a :age property with a value that 
is not integer), or should it simply ignore :alice entirely because the type of 
the property (string) doesn't match the type that I'm querying for?
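Under SPARQL semantics the answer is neither: comparing an xsd:string with an integer is a type error inside the FILTER, and a FILTER error eliminates just that row, so :alice is dropped silently while the query as a whole succeeds. A sketch:

```sparql
# Data:  :alice :age "25"^^xsd:string .   :bob :age 20 .
SELECT * WHERE {
  ?s :age ?age .
  # "25"^^xsd:string < 30 is a type error; the error removes the row,
  # so only :bob appears in the results — no query-level error is raised.
  FILTER (?age < 30)
}
```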




> Sent: Monday, July 19, 2021 at 12:40 PM
> From: "Andy Seaborne" 
> To: users@jena.apache.org
> Subject: Re: Does Jena use duck typing?
>
> It store triples/quads.
> Think of it as a table of triples and table of quads
>
> Comparison is defined by XQuery/XPath Functions and Operators.
>
> But maybe I don't understand what's behind the question.
>
> On 19/07/2021 07:02, Laura Morales wrote:
> > How is Jena able to index and search/compare properties with different data 
> > types?
> > For example if I have this graph
> >
> >  :alice :foobar "2021-07-16"^^xsd:date;
> >  :alice :foobar "foobar"^^xsd:string;
> >  :alice :foobar "42"^^xsd:integer;
>
> You can't compare "2021-07-16"^^xsd:date with an xsd:string or an
> xsd:integer.
>
> They have different value spaces.
>
>  Andy


RDF* multiple edges

2021-07-19 Thread Laura Morales
In RDF*, would these two be considered different edges or one single edge?

:alice :lives :NYC {| :from 2000; :to 2002 |}
:alice :lives :NYC {| :from 2010; :to 2020 |}


Re: Does Jena use duck typing?

2021-07-19 Thread Laura Morales
How is Jena able to index and search/compare properties with different data 
types?
For example if I have this graph

:alice :foobar "2021-07-16"^^xsd:date .
:alice :foobar "foobar"^^xsd:string .
:alice :foobar "42"^^xsd:integer .




> Sent: Friday, July 16, 2021 at 10:06 AM
> From: "Andy Seaborne" 
> To: users@jena.apache.org
> Subject: Re: Does Jena use duck typing?
>
> Literals are always datatyped in RDF. No guessing.
>
>
>
> There syntax conveniences:
>
> "abc" is the same as writing "abc"^^xsd:string.
>
> The specs say to prefer writing output without ^^xsd:string.
>
> In Turtle and related syntaxes:
>
> 42 is an xsd:integer == "42"^^xsd:integer
>
> 42.99 is an xsd:decimal
>
> 42e0 is an xsd:double.
>
> "2021-07-16" is string.
> "2021-07-16"^^xsd;date is a date.
>
> and language strings "abc"@en have datatype rdf:langString.
>
> On 16/07/2021 08:35, Laura Morales wrote:
> > When I insert new triples into a Jena/Fuseki store, are *all* the quoted 
> > literals treated as strings by default unless I specify the type explicitly 
> > (eg. xsd:dateTime)? Or does Jena use duck typing to determine the best type 
> > fit for storing the value?
> > What about numbers instead? Will Jena store 42 as an xsd:integer and 42.99 
> > as xsd:double if I don't explicitly write the type?
> >
> > How can I specify a set of constraints in Fuseki for all the properties of 
> > my model? For example "this property is a double, with range [1.0 .. 2.0]" 
> > (the same way that I can specify constraints on Postgres for example)?
>
> Using ontology/schema/shapes.
>
>  Andy
>


Does Jena use duck typing?

2021-07-16 Thread Laura Morales
When I insert new triples into a Jena/Fuseki store, are *all* the quoted 
literals treated as strings by default unless I specify the type explicitly 
(eg. xsd:dateTime)? Or does Jena use duck typing to determine the best type fit 
for storing the value?
What about numbers instead? Will Jena store 42 as an xsd:integer and 42.99 as 
xsd:double if I don't explicitly write the type?

How can I specify a set of constraints in Fuseki for all the properties of my 
model? For example "this property is a double, with range [1.0 .. 2.0]" (the 
same way that I can specify constraints on Postgres for example)?
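Jena's answer to Postgres-style constraints is SHACL (Fuseki can expose a shacl validation endpoint, as in the config quoted elsewhere in this archive). A hedged sketch of a "double in [1.0, 2.0]" constraint, with invented property names:

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix :    <http://example.org/> .

# Validates every subject that carries a :score property.
:ScoreShape a sh:NodeShape ;
    sh:targetSubjectsOf :score ;
    sh:property [
        sh:path :score ;
        sh:datatype xsd:double ;
        sh:minInclusive 1.0 ;
        sh:maxInclusive 2.0 ;
    ] .
```

Unlike Postgres constraints, validation is a separate step (a report), not something enforced on every insert by default.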


Re: Terms context

2021-04-29 Thread Laura Morales
> Can you use subproperties?
>
> One point of RDF is to avoid collisions do data can be merged safely.

I don't think subproperties are the answer. My problem is that the meaning of a 
property is defined within the boundaries of the whole vocabulary, whereas I'd 
like the meaning to be defined within the boundaries of its "neighborhood" 
context. If it makes sense.
For example I could have two classes, RealFriend.address and 
VirtualFriend.address. Here I'd like "address" to mean 2 different things 
(physical address, email address) depending on the local context, ie. 
RealFriend vs VirtualFriend. Of course I can use multiple vocabularies, or I 
can rename them phys_address/email_address; but I hope the example is not too 
trivial and that you can understand what I mean.
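A subproperty sketch of that example (names invented here) keeps one generic :address while letting each context narrow its meaning:

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix :     <http://example.org/> .

:address       a rdf:Property .
# Two context-specific refinements; a query over :address (with RDFS
# inference) still finds both, while each subproperty keeps its own sense.
:physAddress   rdfs:subPropertyOf :address ; rdfs:domain :RealFriend .
:emailAddress  rdfs:subPropertyOf :address ; rdfs:domain :VirtualFriend .
```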


Terms context

2021-04-29 Thread Laura Morales
I have problems with the fact that, in English, words can have multiple 
meanings and can also be used as verbs, nouns, etc. In RDF, I feel like I'm 
compelled to define a term and its one meaning that is unique across the entire 
vocabulary. If I want to use the same term to mean two or more things, I have 
to use two dictionaries or I have to come up with weird combinations of 
multiple words. You know, like SimpleBeanFactoryAwareAspectInstanceFactory.
I was wondering if there is any way to define a term whose meaning depends on 
the context. For example Lorem.foobar and Ipsum.foobar, "foobar" could mean two 
entirely different things depending on whether it's a property of the type 
Lorem or type Ipsum. AFAIK OWL defines domains/ranges for terms, so maybe these 
can be used for this goal? What would be the practical implications, for 
example if I were to use Fuseki without an OWL reasoner (ie. just by loading a 
bunch of triples and start querying with SPARQL)?


[RDF*] How to model multiple uses of relations

2021-01-03 Thread Laura Morales
In property graphs it's possible to use a relation multiple times, for example

Foobar -[president_of {from: 1950, to:1954}]-> Japan
Foobar -[president_of {from: 1962, to:1966}]-> Japan

where "from" and "to" are to properties of the "president_of" relation. This is 
an old problem that has always remained impossible to translate to RDF. In RDF 
there is only one relation, one "link" from a node to another. There cannot be 
2 different relations with the same name.
I wonder, does RDF* change anything in regard to this behavior? I guess it does 
not but... I'd still like to ask anyway. For example the following Turtle* will 
not achieve that, right?

<< :foobar :president_of :japan >> :from 1950 ; :to 1954 .
<< :foobar :president_of :japan >> :from 1962 ; :to 1966 .

:president_of is always the same one relation, correct?
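Right — and in RDF-star the two quoted triples are the same term, so all four annotations attach to one triple. One common workaround is an intermediate node per term of office; a sketch with invented property names:

```turtle
@prefix : <http://example.org/> .

# One blank node per term of office, so the 1950s and 1960s
# presidencies stay distinct instead of merging on one triple.
:foobar :heldOffice [ :role :president_of ; :country :japan ;
                      :from 1950 ; :to 1954 ] ,
                    [ :role :president_of ; :country :japan ;
                      :from 1962 ; :to 1966 ] .
```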



Is it possible to use UTF8 IRIs in Turtle?

2020-12-30 Thread Laura Morales
Is there a way to write UTF8 IRIs with Turtle without all the %-encoded 
characters? I mean like this  or  or ex:"alice 
smith"? The only way that I know to write those characters is like this 
, ie. by writing the encoded URI myself. Is there any syntax 
that I can use to write UTF8 characters instead, and have those characters 
automatically be parsed as IRIs? Like when I type a string in my browser, I 
type UTF8 but it's automatically url-encoded to a URL?
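For what it's worth, Turtle's grammar does allow raw UTF-8 (and \uXXXX escapes) directly inside <...> IRIs and in prefixed-name local parts, so %-encoding is not required at the syntax level; a sketch:

```turtle
@prefix ex: <http://example.org/> .

# Raw UTF-8 is legal in an IRIREF and in a local name ...
<http://example.org/héllo> ex:knows ex:müller .
# ... and \u escapes denote the same characters:
<http://example.org/h\u00E9llo> ex:label "same IRI as the one above" .
```

Characters that break the grammar (a space, <, >, ", etc.) still cannot appear unencoded, which is why ex:"alice smith" has no unescaped form.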


Re: Turtle* same term twice

2020-12-21 Thread Laura Morales
Everything is clear, thank you. But can I just say that {| |} is so ugly. 
Since the spec is still WIP, is there any chance that it could be changed to 
something else? A 1-character symbol maybe?




> Sent: Monday, December 21, 2020 at 1:00 PM
> From: "Andy Seaborne" 
> To: users@jena.apache.org
> Subject: Re: Turtle* same term twice
>
>
>
> On 21/12/2020 11:34, Laura Morales wrote:
> > Pardon, a couple of things that are not completely clear to me:
> >
> > - is    :a :b :c {| :d "object" |}    a valid Turtle syntax? I've never 
> > seen it before
>
> Yes - it is the new annotation syntax for RDF*
>
> See the link in previous email to the RDF-start community test cases.
>
> >
> > - if I load this with Fuseki/Jena    << :a :b :c >> :d :e .    will Jena 
> > automatically create the :a :b :c triple? This is important for me to know, 
> > or if there is a switch to enable this behavior
>
> No, it will not create the triple ":a :b :c"
>
>  << :a :b :c >> :d :e .
>
> is one triple.
>
> subject = << :a :b :c >>
> predicate = :d
> object = :e
>
> Simply write:
>
> :a :b :c .
> << :a :b :c >> :d :e .
>
> or use
>
> :a :b :c {| :d :e |}
>
> (the latter is not in 3.17.0)
>
> This is different to the original paper. This is the same as RDF-star
> community specs at the moment.
>
> Annotation syntax arose to make it convenient to assert and annotate at
> the same time.
>
> Always asserting ":a :b :c" when <<>> is used is limiting:
>
> << :a :b :c >> :withdrawn "2020-12-31" .
>
> is impossible because :a :b :c would still be there. i.e. you can't talk
> about a triple without it being "true" - true means
> graph contains (:a :b :c)
>
> Some of the use cases :
> https://w3c.github.io/rdf-star/UCR/rdf-star-ucr.html
>
>  Andy


Re: Turtle* same term twice

2020-12-21 Thread Laura Morales
Pardon, a couple of things that are not completely clear to me:

- is :a :b :c {| :d "object" |} a valid Turtle syntax? I've never seen it
before

- if I load this with Fuseki/Jena << :a :b :c >> :d :e . will Jena
automatically create the :a :b :c triple? This is important for me to know, or
if there is a switch to enable this behavior

Otherwise great explanation as always, thank you Andy.




> Sent: Monday, December 21, 2020 at 12:21 PM
> From: "Andy Seaborne" 
> To: users@jena.apache.org
> Subject: Re: Turtle* same term twice
>
> 
> 
> On 21/12/2020 07:47, Lorenz Buehmann wrote:
> > 
> > On 20.12.20 17:19, Andy Seaborne wrote:
> >>
> >>
> >> On 20/12/2020 09:20, Lorenz Buehmann wrote:
> >>>
> >>> On 19.12.20 21:14, Laura Morales wrote:
> >>>> Is this
> >>>>
> >>>>   << :a :b :c >> :d ; :e .
> >>>>
> >>>> the equivalent of this?
> >>>>
> >>>>   << :a :b :c >> :d .
> >>>>   << :a :b :c >> :e .
> >>>>
> >>>> Will Fuseki/Jena store them and treat them the same exact way?
> >>> https://w3c.github.io/rdf-star/rdf-star-cg-spec.html#turtle-star-grammar
> >>
> >> Yes - there are no changes to Turtle except to add <<>> as a new kind
> >> of RDF term. For syntax, the new annotation syntax (in issue 9) is
> >> likely to happen and is a way to write <<>> and assert the triple in
> >> one form.
> > 
> > Yep - something that I think might be confusing for people start using
> > RDF* might be the fact that
> > 
> > << :a :b :c >> :d .
>  >
> > is just an annotation but doesn't add the triple itself.
> 
> << :a :b :c >> :d "object" .
> 
> is a triple. It is a triple about another triple, ":a :b :c". These <<>> things 
> behave like literals in the sense that their representation tells you 
> everything you need to know about them.
> 
> The subject is the (new) RDF term << :a :b :c >>.
> 
> > I'm also
> > wondering how triple stores will handle this if the triple itself
> > doesn't exist
> 
> << :a :b :c >> is a new kind of Node in Jena (Node_Triple).
> 
> >  - will it simply be dropped after parsing the whole
> > document is done? Given that the triple could occur after the annotation
> > in a stream, this needs some more effort for triple stores, right?
> 
> Not in Jena the <<>> is a new RDF Term (node) and is a first-class 
> object in the system. It does not need triple ":a :b :c" to exist.
> 
> Annotations are not stored directly with the triple they annotate. There 
> is an indirection through the <<>> term.
> 
> > Also,
> > what happens if a SPARQL INSERT does add just the annotation? I guess
> > nothing, or will the annotation be kept nevertheless - I don't think so?
> > 
> > On the other hand, the annotation syntax will add both, the triple and
> > the annotation in a step - this is nice.
> 
> For our readers: this is annotation syntax:
> 
> :a :b :c {| :d "object" |}
> 
> it is syntax for two triples:
> 
> :a :b :c .
> << :a :b :c >> :d "object" .
> 
> Modelling in the data is used for complex use cases -
> Here is a larger example where we have two separate sources for a triple:
> 
> PREFIX :   <http://example/>
> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> 
> :s :p :o {| :source [ :graph <http://host1/> ;
>:date "2020-01-20"^^xsd:date
>  ] ;
>  :source [ :graph <http://host2/> ;
>:date "2020-12-31"^^xsd:date
>  ]
>|} .
> 
> It is:
> 
> @prefix :  <http://example/> .
> @prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
> 
> :s  :p  :o .
> 
> << :s :p :o >>
>  :source  [ :date   "2020-12-31"^^xsd:date ;
> :graph  <http://host2/>
>   ] .
> << :s :p :o >>
>  :source  [ :date   "2020-01-20"^^xsd:date ;
> :graph  <http://host1/>
>   ] .
> 
> or (same triples)
> 
> << :s :p :o >>
>  :source  [ :date   "2020-12-31"^^xsd:date ;
> :graph  <http://host2/>
>   ] ;
>  :source  [ :date   "2020-01-20"^^xsd:date ;
> :graph  <http://host1/>
>   ] .
> 
> Like every use of "1"^^xsd:integer or <http://example/> is the same RDF 
> term (and unlike the []-syntax), every use of <<:s :p :o>> is the same 
> term.
> 
>  Andy
> 
> > 
> >>
> >> Everything else is left untouched.
> >>
> >> ";" and "," are just syntactic sugar in Turtle.
> >>
> >> How the triples are written makes no difference - a graph is a set of
> >> triples.
> >>
> >> Syntax test suite:
> >>
> >> https://w3c.github.io/rdf-star/tests/turtle/syntax/manifest.html
> >>
> >>      Andy
>


Turtle* same term twice

2020-12-19 Thread Laura Morales
Is this

<< :a :b :c >> :d ; :e .

the equivalent of this?

<< :a :b :c >> :d .
<< :a :b :c >> :e .

Will Fuseki/Jena store them and treat them the same exact way?


Turtle* multiple terms at once

2020-12-17 Thread Laura Morales
All the examples online about RDF* use a one-triple term, like this (taken from 
Fuseki docs)

<< :john foaf:name "John Smith" >> dct:source  .

I wonder if there is any way in Turtle* to apply the same properties to 
multiple terms at once? Something like this

<< :john :name "John"; :age 20 >> dct:source ex:source .

Or do I have to write every term one by one? I couldn't find any documentation 
about this.
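As far as the RDF-star CG draft goes, a quoted triple wraps exactly one triple, so each statement gets its own annotation; with the annotation syntax this can at least be written compactly (a hedged sketch reusing the names from the question, with assumed prefix declarations):

```turtle
@prefix :    <http://example/> .
@prefix ex:  <http://example/> .
@prefix dct: <http://purl.org/dc/terms/> .

# One {| |} block per statement; each attaches only to the
# immediately preceding subject-predicate-object triple.
:john :name "John" {| dct:source ex:source |} ;
      :age  20     {| dct:source ex:source |} .
```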


Re: Indexing datatypes

2020-12-16 Thread Laura Morales
> Andy can confirm, but AFAIK everything is indexed in Jena.
> And a number is stored as such .


Sorry how does/can it index this?

:alice :age 21; :age "twenty-one" .

By converting everything to strings?


Indexing datatypes

2020-12-16 Thread Laura Morales
I'm able to configure Fuseki for fulltext search with Lucene, but I don't see 
any other type of indexes in the documentation. If there is a property that is 
(ab)used to link multiple datatypes, say for example

Turtle
--
:alice :age 21; :age "twenty-one" .

How is this stored by Jena (everything as a string)? Can I create an index for 
a specific datatype, say integer or datetime, and how does this work if the 
property is abused like in the example above?
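On the query side, values of a mixed-type property can at least be separated by datatype in SPARQL (an illustrative query over the example above, not an index):

```sparql
PREFIX :    <http://example/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Keep only the integer-typed :age values
SELECT ?s ?age WHERE {
  ?s :age ?age .
  FILTER(datatype(?age) = xsd:integer)
}
```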


Re: Does Jena support RDF*?

2020-12-14 Thread Laura Morales
> This is an alternative to RDF* , AFAIK .
> If someone is interested, I can document the structure of the secondary TDB
> database.

I don't want to abuse your time but yes, this would be helpful. Even just a 
sketch of it, just to get the idea.


Re: Does Jena support RDF*?

2020-12-13 Thread Laura Morales
> What's your interest in RDF*?

There seems to have been this endless debate about triplestores vs property 
graphs for as long as I can remember. This new standard apparently promises to 
be the best of both worlds by supporting RDF plus what they call "richer types" 
(aka nodes, vertexes). "Richer" compared to the extremely atomic level of 
triples. So my interest is mostly to try it and see how it compares. Also from 
a storage point of view since everything that I've read claims that property 
graphs are faster to traverse because their storage is not "index-based" like 
triples. I've personally tried to use a couple of property graphs databases but 
I keep going back to triplestores for the only reason that they use more 
standardized technology. Every property graph instead seems to have its own way 
of doing things; I couldn't even find a standardized format for 
exporting/importing graphs or a standardized query language (although there are 
some efforts toward one called GQL). So if RDF* can combine the best parts of 
both worlds, I want to try it :)
Please note that I'm not personally interested in the semantic web or the RDF 
artificial intelligence koolaid. I'm interested in the graph model with a great 
appreciation for free standards and simplicity. If RDF* can make the design of 
graphs simpler (ie. richer structures, fewer hacks and workarounds) then it's 
definitely something that I will use.
Another issue for me with property graphs, but I would like to hear your 
feedback on this, is that properties are indexed globally and it's my 
understanding that they only accept one data type (eg. Integer). So I'm not 
sure how indexing work over there from a storage point of view. I think they 
would require me to define 2 properties instead of one or some kind of 
namespace, let's say "ns1_age" and "ns2_age" where one property takes Integer 
and the other one String for example. Which, at the end of the day, is the same 
thing as using RDF prefixes.


Does Jena support RDF*?

2020-12-13 Thread Laura Morales
I've only recently discovered the existence of RDF* and Turtle*. Looks like 
they were introduced around 2019. Does Jena have support for these in the 
current release?


Re: Fuseki - Grant permissions to all by default

2020-01-03 Thread Laura Morales
> I was able to find the Shiro.ini in fuseki2 code, but not in fuseki1. Can
> you please point me to its location in earlier version?

The current release of Fuseki is 3.13; I doubt Fuseki 1 is still supported and 
maybe it didn't even have Shiro at all. If you start the server without a 
shiro.ini file, and if it does use Shiro, it should create a default file 
automatically (that you can edit later). But seriously, upgrade.


Re: Fuseki - Grant permissions to all by default

2019-12-14 Thread Laura Morales
I think the shiro.ini configuration is configured like that on purpose, you do 
not want anon access from every host by default. So I would guess that that's 
unlikely to change.
If you have only a few instances, just change the file for any of them or 
copy/paste the same file. If you have too many, either create links or maybe 
mount a shared folder.
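For reference, the change being copied between hosts is the one described in the quoted messages of this thread — a shiro.ini [urls] edit along these lines (hedged sketch; adjust paths and filters to your deployment):

```ini
[urls]
## Allow anonymous access to the admin/query paths from any host.
## The shipped default restricts them to localhost instead:
# /$/** = localhostFilter
/$/** = anon
```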




> Sent: Sunday, December 15, 2019 at 7:47 AM
> From: "Amandeep Srivastava" 
> To: users@jena.apache.org
> Subject: Re: Fuseki - Grant permissions to all by default
>
> Yes, I'm talking about Shiro.ini file, apologies if the question isn't
> clear.
>
> I'm trying to deploy the fuseki server on multiple hosts, changing it
> manually would mean manually logging into each of the hosts to do it.
>
> Also, its not necessary that I launch fuseki from same directory everytime
> since I can have multiple instances of it running, that means a new run
> folder for every run and manually editing Shiro for each run too.
>
> On Sun, 15 Dec, 2019, 12:13 PM Laura Morales,  wrote:
>
> > I think you're talking about the shiro.ini file but I don't think the
> > question is very clear. What is the problem with editing one file manually?
> >
> >
> >
> >
> > > Sent: Sunday, December 15, 2019 at 7:32 AM
> > > From: "Amandeep Srivastava" 
> > > To: Users@jena.apache.org
> > > Subject: Fuseki - Grant permissions to all by default
> > >
> > > Hi,
> > >
> > > I'm trying to run an instance of fuseki server. This run generates a
> > folder
> > > called 'run', containing all fuseki config files.
> > >
> > > I have to manually comment /$/** = localhostFilter and add /$/**=anon to
> > > access and query the database from UI.
> > >
> > > Is there a way to set this to default?
> > >
> > > Thanks.
> > >
> >
>


Re: Fuseki - Grant permissions to all by default

2019-12-14 Thread Laura Morales
I think you're talking about the shiro.ini file but I don't think the question 
is very clear. What is the problem with editing one file manually?




> Sent: Sunday, December 15, 2019 at 7:32 AM
> From: "Amandeep Srivastava" 
> To: Users@jena.apache.org
> Subject: Fuseki - Grant permissions to all by default
>
> Hi,
>
> I'm trying to run an instance of fuseki server. This run generates a folder
> called 'run', containing all fuseki config files.
>
> I have to manually comment /$/** = localhostFilter and add /$/**=anon to
> access and query the database from UI.
>
> Is there a way to set this to default?
>
> Thanks.
>


Re: TDB optimization query

2019-11-12 Thread Laura Morales
This 
(http://mail-archives.apache.org/mod_mbox/jena-users/201712.mbox/%3CCAHM9nqQfOnyhBQj=jr-i9ieqhiv7vflfnleoycmhsupdd7n...@mail.gmail.com%3E)
 is an old thread, somewhat related to this question.




> Sent: Tuesday, November 12, 2019 at 1:29 PM
> From: "Amandeep Srivastava" 
> To: Users@jena.apache.org
> Subject: TDB optimization query
>
> Hi,
>
> I'm trying to create a TDB database from Wikidata's official RDF dump to
> read the data using Fuseki service. I need to make a few queries for my
> personal project, running which the online service times out.
>
> I have a 12 core machine with 36 GB memory.
>
> Can you please advise on the best way for creating the database? Since the
> dump is huge, I cannot try all the approaches. Besides, I'm not sure if the
> tdbloader function works in a similar way on data of different sizes.
>
> Questions:
>
> 1. Which one would be better to use - tdb.tdbloader2 (TDB1) or
> tdb2.tdbloader (TDB2) for creating the database and why? Any specific
> configurations that I should be aware of?
>
> 2. I'm running a job currently using tdb.tdbloader2 but it is using just a
> single core. Also, it's loading speed is decreasing slowly. It started at
> an avg of 120k tuples and is currently at 80k tuples. Can you advise how
> can I utilize all the cores of my machine and maintain the loading speed at
> the same time?
>
> Regards,
> Aman
>


Re: TDB optimization query

2019-11-12 Thread Laura Morales
tdb2.tdbloader has a --loader=parallel option, you could try with that.
From my past experience, the decreasing loading speed is caused by IO 
saturation. Do you have an HDD or an SSD?
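A possible invocation of the parallel loader (a command sketch only; the database location and dump filename are placeholders, not from this thread):

```shell
# TDB2 bulk load with the parallel loader; benefits from fast storage (SSD/NVMe).
tdb2.tdbloader --loader=parallel --loc /data/wikidata-tdb2 latest-all.ttl.gz
```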




> Sent: Tuesday, November 12, 2019 at 1:29 PM
> From: "Amandeep Srivastava" 
> To: Users@jena.apache.org
> Subject: TDB optimization query
>
> Hi,
>
> I'm trying to create a TDB database from Wikidata's official RDF dump to
> read the data using Fuseki service. I need to make a few queries for my
> personal project, running which the online service times out.
>
> I have a 12 core machine with 36 GB memory.
>
> Can you please advise on the best way for creating the database? Since the
> dump is huge, I cannot try all the approaches. Besides, I'm not sure if the
> tdbloader function works in a similar way on data of different sizes.
>
> Questions:
>
> 1. Which one would be better to use - tdb.tdbloader2 (TDB1) or
> tdb2.tdbloader (TDB2) for creating the database and why? Any specific
> configurations that I should be aware of?
>
> 2. I'm running a job currently using tdb.tdbloader2 but it is using just a
> single core. Also, it's loading speed is decreasing slowly. It started at
> an avg of 120k tuples and is currently at 80k tuples. Can you advise how
> can I utilize all the cores of my machine and maintain the loading speed at
> the same time?
>
> Regards,
> Aman
>


Re: No such type:

2019-10-06 Thread Laura Morales
$ cat run/config.ttl
@prefix :<#> .
@prefix fuseki:   .
@prefix rdf:  .
@prefix rdfs: .
@prefix tdb:  .
@prefix ja:   .
@prefix text: .

[] rdf:type fuseki:Server ;
ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "3" ] .


--


$ cat run/configuration/demo.ttl
PREFIX :<#>
PREFIX fuseki:  
PREFIX ja:  
PREFIX rdf: 
PREFIX rdfs:
PREFIX tdb: 
PREFIX text:

:service a fuseki:Service ;
rdfs:label "demo" ;
fuseki:name "demo" ;
fuseki:serviceQuery "query" ;
fuseki:dataset :text_dataset .

:text_dataset a text:TextDataset ;
text:dataset :dataset ;
text:index   :dataset_index .

:dataset a tdb:DatasetTDB ;
tdb:location "..." .

:dataset_index a text:TextIndexLucene ;
text:directory  ;
text:entityMap :index_map .

:index_map a text:EntityMap ;
text:entityField  "uri" ;
text:defaultField "field" ;
text:map ([
text:field "field" ;
text:predicate rdfs:label ]) .


--


$ ./fuseki-server --version
Jena:   VERSION: 3.12.0
Jena:   BUILD_DATE: 2019-05-27T16:07:27+
TDB:VERSION: 3.12.0
TDB:BUILD_DATE: 2019-05-27T16:07:27+
Fuseki: VERSION: 3.12.0
Fuseki: BUILD_DATE: 2019-05-27T16:07:27+


--

With the above configuration I do not get a "No such type" error, but I get a 
sort of mixed behavior. Sometimes it seems to work, while most of the times it 
returns zero results. And I cannot reproduce it either... it gives me zero 
results except a few times it magically starts working (returning results).
I get the "No such type" error when using a RDFDataset instead of DatasetTDB, 
but at this point I'd love to understand what's going on here before trying to 
understand the RDFDataset error.



> Sent: Sunday, October 06, 2019 at 8:10 PM
> From: "Chris Tomlinson" 
> To: users@jena.apache.org
> Subject: Re: No such type: 
>
> Hi Laura,
>
> It would be helpful to see the assembler file. Then we may get closer to 
> whether there's a bug.
>
> Regards,
> Chris


No such type:

2019-10-06 Thread Laura Morales
I'm trying to enable full text search on a Fuseki v3.12 instance but I get the 
error shown below. The assembler is pretty much a copycat of the documentation, 
with a Lucene text index. The assembler contains the prefix "text: 
".
Is this a bug?

$ java -cp fuseki-server.jar jena.textindexer --desc=run/config.ttl
org.apache.jena.sparql.ARQException: No such type: 

at 
org.apache.jena.sparql.core.assembler.AssemblerUtils.build(AssemblerUtils.java:134)
at 
org.apache.jena.query.text.TextDatasetFactory.create(TextDatasetFactory.java:38)
at jena.textindexer.processModulesAndArgs(textindexer.java:90)
at jena.cmd.CmdArgModule.process(CmdArgModule.java:52)
at jena.cmd.CmdMain.mainMethod(CmdMain.java:92)
at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
at jena.textindexer.main(textindexer.java:52)



Re: [ANN] Apache Jena 3.13.0

2019-09-29 Thread Laura Morales
> JENA-1731: Fuseki endpoint configuration

Is updating 3.12 to 3.13 going to break server configuration? I wonder if there 
is a description of the new configuration?


Re: [ANN] Apache Jena 3.13.0

2019-09-29 Thread Laura Morales
Thank you devs for working on Fuseki.


> Sent: Sunday, September 29, 2019 at 10:56 AM
> From: "Andy Seaborne" 
> To: "users@jena.apache.org" 
> Subject: [ANN] Apache Jena 3.13.0
>
> The Apache Jena development community is pleased to
> announce the release of Apache Jena 3.13.0.
>
> This release includes a built-in SHACL engine for
> core and SPARQL constraints.
>
> https://jena.apache.org/documentation/shacl/
>
> == Major items
>
> JENA-1693: Add Aggregate Function MEDIAN and MODE (from Marco Neumann)
> JENA-1731: Fuseki endpoint configuration
> JENA-1695: DB storage refactoring
> JENA-1718: Remove jena-spatial from the build
> JENA-1760: Retire jena-maven-tools
> JENA-1733: SHACL engine
>
> == Other
>
> JIRA items https://s.apache.org/jena-3.13.0-jira
>
> == Upgrades to libraries
>
> FasterXML jackson:: 2.9.9 -> 2.9.10
>Various CVEs.
>
> jsonld-java :: 0.12.3 -> 0.12.5
>
> JENA-1754: Apache Commons Compress :: 1.18 -> 1.19 (Brad Hards)
>
> JENA-1756: Dependency updates.
> micrometer :: 1.1.3->1.2.1
> Apache Commons Lang3 :: 3.4->3.9
> Apache Commons CSV :: 1.5 -> 1.7
> Apache HttpClient :: 4.5.5 ->  4.5.10
> Apache Commons Collections4 :: 4.1 -> 4.4
>
> == Obtaining Apache Jena 3.13.0
>
> * Via central.maven.org
>
> The main jars and their dependencies can used with:
>
>
>  org.apache.jena
>  apache-jena-libs
>  pom
>  3.13.0
>
>
> Full details of all maven artifacts are described at:
>
>  http://jena.apache.org/download/maven.html
>
> * As binary downloads
>
> Apache Jena libraries are available as a binary distribution of
> libraries. For details of a global mirror copy of Jena binaries please see:
>
> http://jena.apache.org/download/
>
> * Source code for the release
>
> The signed source code of this release is available at:
>
> http://www.apache.org/dist/jena/source/
>
> and the signed master source for all Apache Jena releases is available
> at: http://archive.apache.org/dist/jena/
>
> == Contributing
>
> If you would like to help out, a good place to look is the list of
> unresolved JIRA at:
>
> http://s.apache.org/jena-jira-current
>
> or review pull requests at
>
> https://github.com/apache/jena/pulls
>
> or drop into the dev@ list.
>
> We use github pull requests and other ways for accepting code:
>   https://github.com/apache/jena/blob/master/CONTRIBUTING.md
>


Fuseki vs Rya

2019-09-24 Thread Laura Morales
Now that Rya has been promoted to top-level project, I'd like to hear your 
comments about Fuseki vs Rya. Pros of both, when and why I should use one 
or the other. Thanks!


Re: Riot warning for Unicode NFC IRIs

2019-09-20 Thread Laura Morales
> Both Wikipedia and DBpedia can not handle NFKC: 
> https://en.wikipedia.org/wiki/Ranma_1⁄2

The wikipedia link is https://en.wikipedia.org/wiki/Ranma_%C2%BD
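The NFC/NFKC distinction at issue can be reproduced with Python's unicodedata (a quick, Jena-independent check of why ½ triggers the NOT_NFKC warning while staying NFC-normal):

```python
# Verify the NFC/NFKC behaviour of U+00BD (½) discussed in this thread.
import unicodedata

half = "\u00bd"                                # ½ VULGAR FRACTION ONE HALF
nfc  = unicodedata.normalize("NFC",  half)
nfkc = unicodedata.normalize("NFKC", half)

print([hex(ord(c)) for c in nfc])    # ['0xbd']                    -> unchanged
print([hex(ord(c)) for c in nfkc])   # ['0x31', '0x2044', '0x32']  -> 1⁄2
```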


Re: Riot warning for Unicode NFC IRIs

2019-09-20 Thread Laura Morales
Valid URIs are basically ASCII strings. I don't think you can stick any other 
character in there.


> Sent: Friday, September 20, 2019 at 9:11 AM
> From: "Sebastian Hellmann" 
> To: users@jena.apache.org
> Subject: Riot warning for Unicode NFC IRIs
>
> Hi all,
> 
> I really don't understand these RIOT warnings:
> 
> 19/09/20 08:43:00 WARN riot: [line: 229, col: 1 ] Bad IRI: 
>  Code: 47/NOT_NFKC in PATH: The IRI 
> is not in Unicode Normal Form KC.
> 19/09/20 08:43:00 WARN riot: [line: 229, col: 1 ] Bad IRI: 
>  Code: 56/COMPATIBILITY_CHARACTER 
> in PATH: Bad character
> 
> for ½ NFC == NFD as well as NFKD == NFKC
> 
> Tested here: https://minaret.info/test/normalize.msp
> 
> Result string:½
> Result in hex:bd
> 
> Result string:1⁄2
> Result in hex:31 2044 32
> 
> Both Wikipedia and DBpedia can not handle NFKC: 
> https://en.wikipedia.org/wiki/Ranma_1⁄2
> 
> Also the RDF Spec says to normalize to NFC: 
> https://www.w3.org/TR/rdf11-concepts/#section-IRIs
> 
> 
> We are using these versions:
> 
> +- org.apache.jena:jena-core:jar:3.7.0:compile
> [INFO] |  |  \- org.apache.jena:jena-base:jar:3.7.0:compile
> 
> Concretely I have these questions:
> 
> - can you check whether this is a bug or point me to the issue tracker, 
> where this was or should be recorded.
> 
> - How can I disable specifically these two warning messages, either by 
> configuration or by tuning log4j?
> 
> 
> -- 
> All the best,
> Sebastian Hellmann
> 
> Director of Knowledge Integration and Linked Data Technologies (KILT) 
> Competence Center
> at the Institute for Applied Informatics (InfAI) at Leipzig University
> Executive Director of the DBpedia Association
> Projects: http://dbpedia.org, http://nlp2rdf.org, 
> http://linguistics.okfn.org, https://www.w3.org/community/ld4lt 
> 
> Homepage: http://aksw.org/SebastianHellmann
> Research Group: http://aksw.org
>


Re: "not in a transaction", but only sometimes

2019-09-12 Thread Laura Morales
> >  <#dataset> rdf:type tdb:DatasetTDB ;
> >  tdb:location "DB" ;
> >  tdb:unionDefaultGraph true ;
> >  .
>
> The dataset can have a ja:context setting and that can be the one for
> query union graph.  Not "?default-graph-uri=" which is protocol and per
> query call.


Is there an equivalent of "tdb:unionDefaultGraph true" for a RDFDataset? I'm 
thinking of a property that can be set in the assembler file instead of using 
"?default-graph-uri=urn:x-arq:UnionGraph"? Perhaps something like

:dataset a ja:RDFDataset ;
# Use the union (of the named graphs) as default graph
ja:unionDefaultGraph true ;

ja:namedGraph
   [ ja:graphName   ;
 ja:graph  <#model1> ] ;
ja:namedGraph
   [ ja:graphName   ;
 ja:graph  <#model2> ] ;
...


Re: "not in a transaction", but only sometimes

2019-09-12 Thread Laura Morales
> :dataset a ja:RDFDataset ;
>    ja:namedGraph [ ja:graphName "http://example/name" ;
>ja:graph :graph1 ] ;
>ja:namedGraph ...


a quick test of this one with ?default-graph-uri=urn:x-arq:UnionGraph on Fuseki 
3.12.0 seems to run just fine (actually, it seems to respond even quicker than 
the ja:UnionModel configuration).
Could you please explain in just a few words the difference between using 
?default-graph-uri=urn:x-arq:UnionGraph and ja:UnionModel? I'm having a hard 
time wrapping my head around this.


> set union mode on the dataset or (3.13.0) the
> service for default to query all graphs.


my previous question about this one also stands, is there a new feature being 
introduced in v3.13.0 regarding the union graph?

Thank you Andy for all the help.
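For context, the protocol-parameter route tested above looks roughly like this over HTTP (a hedged sketch; host, port and the "demo" service name are assumptions):

```shell
# Ask Fuseki to use the union of all named graphs as the query's default graph.
curl -G 'http://localhost:3030/demo/query' \
     --data-urlencode 'query=SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }' \
     --data-urlencode 'default-graph-uri=urn:x-arq:UnionGraph'
```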



Re: "not in a transaction", but only sometimes

2019-09-11 Thread Laura Morales
> set union mode on the dataset

I mean you're talking about this

<#dataset> rdf:type tdb:DatasetTDB ;
tdb:location "DB" ;
tdb:unionDefaultGraph true ;
.

> or (3.13.0) the service for default to query all graphs.

is this referring to "?default-graph-uri=" or does it refer to other new 
features in Fuseki 3.13.0 (perhaps something like "tdb:unionDefaultGraph" but 
for RDFDataset instead)?



Re: "not in a transaction", but only sometimes

2019-09-11 Thread Laura Morales
> Presumably this is more than one TDB database.

yes, a few 10s

> What works is when graphs are in the same database.

this is not easy for me to switch to. My problem is that I'm pulling multiple 
sources (RDF) from different places, and I want all of them together to be my 
default graph. If I create a single dataset, every time I update one source I 
have to re-upload the whole thing even if the source that I've updated is only 
a few MB. Graphs are also updated frequently. With separate datasets, I can 
download/update only one dataset and upload much less data.

> What might be better is define the ja:RDFDataset to have the graphs,
> then query the union graph of the dataset. But that again is crossing
> multiple transaction systems so may not work.

Right now I have a RDFDataset with a UnionModel like this

:dataset a ja:RDFDataset ;
ja:defaultGraph :union ;
.

:union a ja:UnionModel ;
ja:subModel :graph1 ;
ja:subModel :graph2 ;
ja:subModel :graph3 ;
.

how would I "give" the graphs to the RDFDataset directly?
Or is there maybe a different way to address my problem (updating graphs 
independently so that I don't have to upload the whole lot every time)?


Re: "not in a transaction", but only sometimes

2019-09-11 Thread Laura Morales
For what it's worth, 3.7.0 does seem to work as well. On the other hand, 3.8.0, 
3.9.0, 3.10.0, 3.11.0, and 3.12.0 all fail with the same error. It looks as if 
something introduced between 3.7.0 and 3.8.0 broke the UnionModel and 
transactions.

Could somebody please suggest to me how to set up a working UnionModel with 
Fuseki 3.12.0? Thanks!




> Sent: Wednesday, September 11, 2019 at 10:46 AM
> From: "Laura Morales" 
> To: users@jena.apache.org
> Cc: users@jena.apache.org
> Subject: Re: "not in a transaction", but only sometimes
>
> > Forgot to mention: Fuseki 3.12
>
> I fear guys that you've introduced a new regression in 3.12.0 regarding the 
> UnionModel/transactions/TDB. I've spent hours debugging my code and Fuseki 
> configuration, until I noticed that my dev server was 3.6.0 but in the prod 
> server I had installed 3.12.0 (the last version). Reverting to 3.6.0 works 
> without problems. Reverting once more to 3.12.0, the problem is there again.
>
> To give some context, I have a very basic UnionModel as described in the 
> previous email. What I need is a union graph of graphs extracted from several 
> TDB datasets (I would like to use TDB2 but it does not support this feature, 
> so all my datasets are TDB). I'm running a DESCRIBE query that extracts data 
> from the union, data that is stored in 2 graphs. With 3.6.0 it works. With 
> 3.12.0 I receive a 500 and Fuseki's logs show me "Not in a transaction". This 
> query however does work sometimes, in particular if I run it alone. I see the 
> problem when I run it after other queries in sequence. No data is ever 
> altered, they are all read-only queries (ASK/DESCRIBE). I really don't know 
> what's going on.
>


Re: "not in a transaction", but only sometimes

2019-09-11 Thread Laura Morales
> Forgot to mention: Fuseki 3.12

I fear guys that you've introduced a new regression in 3.12.0 regarding the 
UnionModel/transactions/TDB. I've spent hours debugging my code and Fuseki 
configuration, until I noticed that my dev server was 3.6.0 but in the prod 
server I had installed 3.12.0 (the last version). Reverting to 3.6.0 works 
without problems. Reverting once more to 3.12.0, the problem is there again.

To give some context, I have a very basic UnionModel as described in the 
previous email. What I need is a union graph of graphs extracted from several 
TDB datasets (I would like to use TDB2 but it does not support this feature, so 
all my datasets are TDB). I'm running a DESCRIBE query that extracts data from 
the union, data that is stored in 2 graphs. With 3.6.0 it works. With 3.12.0 I 
receive a 500 and Fuseki's logs show me "Not in a transaction". This query 
however does work sometimes, in particular if I run it alone. I see the problem 
when I run it after other queries in sequence. No data is ever altered, they 
are all read-only queries (ASK/DESCRIBE). I really don't know what's going on.


Re: "not in a transaction", but only sometimes

2019-09-10 Thread Laura Morales
Forgot to mention: Fuseki 3.12, and this is the error


[2019-09-10 17:02:32] QueryIteratorCheck WARN  Open iterator: TripleMapper/255
[2019-09-10 17:02:32] Fuseki WARN  [7] RC = 500 : Not in a transaction
org.apache.jena.tdb.transaction.TDBTransactionException: Not in a transaction
at 
org.apache.jena.tdb.transaction.DatasetGraphTransaction.get(DatasetGraphTransaction.java:140)
at 
org.apache.jena.tdb.transaction.DatasetGraphTransaction.get(DatasetGraphTransaction.java:52)
at org.apache.jena.sparql.core.DatasetGraphWrapper.getR(DatasetGraphWrapper.java:80)
at org.apache.jena.sparql.core.DatasetGraphWrapper.find(DatasetGraphWrapper.java:181)
at org.apache.jena.sparql.core.GraphView.graphBaseFind(GraphView.java:121)
at org.apache.jena.sparql.core.GraphView.graphBaseFind(GraphView.java:113)
at org.apache.jena.graph.impl.GraphBase.find(GraphBase.java:241)
at org.apache.jena.graph.compose.MultiUnion.multiGraphFind(MultiUnion.java:170)
at org.apache.jena.graph.compose.MultiUnion.graphBaseFind(MultiUnion.java:147)
at org.apache.jena.graph.impl.GraphBase.find(GraphBase.java:241)
at org.apache.jena.graph.impl.GraphBase.graphBaseFind(GraphBase.java:258)
at org.apache.jena.graph.impl.GraphBase.find(GraphBase.java:255)
at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.(QueryIterTriplePattern.java:75)
at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern.nextStage(QueryIterTriplePattern.java:49)
at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:108)
at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101)
at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
at org.apache.jena.sparql.engine.iterator.QueryIterBlockTriples.hasNextBinding(QueryIterBlockTriples.java:63)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
at org.apache.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:58)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
at org.apache.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:74)
at org.apache.jena.sparql.engine.QueryExecutionBase.execDescribe(QueryExecutionBase.java:299)
at org.apache.jena.sparql.engine.QueryExecutionBase.execDescribe(QueryExecutionBase.java:278)
at org.apache.jena.fuseki.servlets.SPARQL_Query.executeQuery(SPARQL_Query.java:358)
at org.apache.jena.fuseki.servlets.SPARQL_Query.execute(SPARQL_Query.java:290)
at org.apache.jena.fuseki.servlets.SPARQL_Query.executeWithParameter(SPARQL_Query.java:239)
at org.apache.jena.fuseki.servlets.SPARQL_Query.perform(SPARQL_Query.java:224)
at org.apache.jena.fuseki.servlets.ActionService.executeLifecycle(ActionService.java:266)
at org.apache.jena.fuseki.servlets.ActionService.execCommonWorker(ActionService.java:155)
at org.apache.jena.fuseki.servlets.ActionBase.doCommon(ActionBase.java:74)
at org.apache.jena.fuseki.servlets.FusekiFilter.doFilter(FusekiFilter.java:73)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61)
at org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
at org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
at org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:449)
at org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:365)
at 

"not in a transaction", but only sometimes

2019-09-10 Thread Laura Morales
Just out of curiosity, has someone else here experienced a query that returns 
"not in a transaction" only sometimes?
I have a simple dataset like this


:dataset a ja:RDFDataset ;
    ja:defaultGraph :union .

:union a ja:UnionModel ;
    ja:subModel :graph1 ;  # TDB
    ja:subModel :graph2 ;  # TDB
    ...


I have a DESCRIBE query that works. However when I submit it from my program 
after other SELECT/ASK queries in sequence, it stops working. I see this error 
"Not in a transaction", and I have to restart Fuseki. All my databases in the 
union are TDB1, not TDB2.


Re: % character in mailto: URI

2019-08-31 Thread Laura Morales
> Is  not a legal mailto: URI? Or does it
> need to be encoded somehow?

That is not a legal URI, strictly speaking. Look up 
https://en.wikipedia.org/wiki/Percent-encoding
I've never in my life seen an email address with a percent sign, but if you 
*must* use it then you have to encode it, it becomes something like 



Re: Long URIs

2019-08-28 Thread Laura Morales
Everything else being equal, no. Not in any noticeable way.


> Sent: Wednesday, August 28, 2019 at 5:37 PM
> From: "Piotr Nowara" 
> To: users@jena.apache.org
> Subject: Long URIs
>
> Hi,
>
> does in your experience using very long URIs (like more than 100
> characters) affect SPARQL performance?
>
> Thanks,
> Piotr
>


Re: RE: Sensible size limit for SPARQL update payload to Fuseki2?

2019-08-07 Thread Laura Morales
> from SPARQLWrapper import SPARQLWrapper
>
> # a lot longer
> myString = "INSERT DATA {}"
>
> def insertFromString(url, sparql):
>     endpoint = SPARQLWrapper(url)
>     endpoint.setQuery(sparql)
>     endpoint.method = 'POST'
>     endpoint.query()
>
> insertFromString('http://localhost:3030/myDS/update', myString)


can you change the HTTP headers and see if it works? Something like this:

endpoint.addCustomHttpHeader("Content-Type", "application/sparql-update")



Re: Sensible size limit for SPARQL update payload to Fuseki2?

2019-08-07 Thread Laura Morales
Basically, this request should work?

POST /database HTTP/1.1
Host: example.com
Content-Type: application/sparql-update

INSERT DATA { < 100 MB of triples > }
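For what it's worth, a minimal sketch of building such a request with Python's standard library; the endpoint URL and the triple are illustrative, and actually sending the request is left commented out:

```python
from urllib.request import Request  # urlopen(req) would actually send it

update = "INSERT DATA { <http://example.org/s> <http://example.org/p> <http://example.org/o> . }"
req = Request(
    "http://localhost:3030/database/update",   # hypothetical Fuseki update endpoint
    data=update.encode("utf-8"),
    headers={"Content-Type": "application/sparql-update"},
    method="POST",
)
# With this content type the body goes over the wire verbatim, so the server
# can stream-parse the update instead of buffering it as an HTML form.
print(req.get_header("Content-type"))
```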




> Sent: Wednesday, August 07, 2019 at 10:44 AM
> From: "Andy Seaborne" 
> To: users@jena.apache.org
> Subject: Re: Sensible size limit for SPARQL update payload to Fuseki2?
>
> Pierre,
> 
> RDFLib/SPARQLWrapper is using an HTML form upload.
> 
> The scalable way is to POST with "Content-type: 
> application/sparql-update" and the INSERT in the body, then it will 
> stream - directly reading the update from the HTTP input stream with no 
> HTML Form (Request.extractFormParameters) on the execution path.
> 
> For an HTML form, the entire request ends up in memory - it's the way 
> that HTML forms have to be handled to see all the name=value pairs in the 
> form. Incidentally, the same is true in the client.
> 
> The default form size is already bumped up to 10M from the Jetty default 
> of 200K.
> 
> If the server is running in verbose mode, the entire SPARQL update is 
> read in for logging/debugging purposes.
> 
> The default jetty configuration is in code. For the form size, that is 
> JettyFusekiWebapp.createWebApp which is 10M - we can make that default 
> bigger but not 101M which is the request.
> 
> Otherwise, break the request into parts and send multiple requests.
> 
>  Andy
> 
> On 07/08/2019 08:49, Pierre Grenon wrote:
> > Thank you, Lorenz.
> > 
> > I did as you suggest and made the changes indicated.
> > 
> > Fuseki started and seems to have accepted the jetty config. But then when 
> > trying to send the update the same error occurs and the limit seems 
> > unmodified (I used2).
> > 
> > Caused by: java.lang.IllegalStateException: Form too large: 100948991 > 
> > 1000
> >  at org.eclipse.jetty.server.Request.extractFormParameters(Request.java:545)
> >  at org.eclipse.jetty.server.Request.extractContentParameters(Request.java:475)
> >  at org.eclipse.jetty.server.Request.getParameters(Request.java:386)
> >  ... 50 more
> > 
> > Can it be that the config does not override some default set elsewhere in 
> > Fuseki?
> > 
> > I’ll try to figure if I’m not doing something else wrong…
> > 
> > Many thanks,
> > Pierre
> > 
> > For reference:
> > https://www.eclipse.org/jetty/documentation/current/configuring-form-size.html
>


Re: RE: Sensible size limit for SPARQL update payload to Fuseki2?

2019-08-06 Thread Laura Morales
I don't know how to configure Jetty myself, much less with Fuseki; I hope 
somebody else on the list does.
Regarding LOAD however, you should be able to use HTTP URIs.


> Sent: Tuesday, August 06, 2019 at 7:01 PM
> From: "Pierre Grenon" 
> To: "'users@jena.apache.org'" 
> Subject: RE: Sensible size limit for SPARQL update payload to Fuseki2?
>
> Ok, so apologies for kinda spamming the list with this.
>
> 1. Laura, I agree with the options you listed below. Although:
> - LOAD, in my experience, requires access to the file system where fuseki is 
> running and I do not have that
> - SOH has that same requirement and also requires me to ssh into the machine, 
> which I don't want to have to do programmatically
> - chunking is my likely work around although it is suboptimal (I serialize an 
> RDFLib in memory graph)
>
> 2. After looking around and doing a bit of archaeology,
>
> https://jena.markmail.org/message/nmtny6wlnvzltws7?q=maxFormContentSize
> (At first seeing this I thought it used to be called 'Fuseky'! I can only 
> recall Joseki)
>
> it seems that the principled approach is to run fuseki with a customised 
> jetty configuration.
>
> http://jena.apache.org/documentation/fuseki2/data-access-control#jetty-configuration
> " Server command line: --jetty=jetty.xml."
> -> is wrong
>
> This:
> > fuseki-server --jetty-config=jetty.xml
> Worked for me.
>
> However, I do not know what to put in jetty.xml
>
> https://www.eclipse.org/jetty/documentation/current/setting-form-size.html
>
> I tried the following snippet but it broke
>
> <?xml version="1.0"?>
> <!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN"
>  "http://www.eclipse.org/jetty/configure_9_3.dtd">
>
> <Configure class="org.eclipse.jetty.server.Server">
>   <Call name="setAttribute">
>     <Arg>org.eclipse.jetty.server.Request.maxFormContentSize</Arg>
>     <Arg></Arg>
>   </Call>
> </Configure>
>
> [2019-08-06 17:07:48] Server ERROR SPARQLServer: Failed to configure 
> server: 0
> java.lang.ArrayIndexOutOfBoundsException: 0
> at org.apache.jena.fuseki.cmd.JettyFusekiWebapp.configServer(JettyFusekiWebapp.java:297)
> at org.apache.jena.fuseki.cmd.JettyFusekiWebapp.buildServerWebapp(JettyFusekiWebapp.java:243)
> at org.apache.jena.fuseki.cmd.JettyFusekiWebapp.(JettyFusekiWebapp.java:99)
> at org.apache.jena.fuseki.cmd.JettyFusekiWebapp.initializeServer(JettyFusekiWebapp.java:94)
> at org.apache.jena.fuseki.cmd.FusekiCmd.runFuseki(FusekiCmd.java:371)
> at org.apache.jena.fuseki.cmd.FusekiCmd$FusekiCmdInner.exec(FusekiCmd.java:356)
> at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
> at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
> at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
> at org.apache.jena.fuseki.cmd.FusekiCmd$FusekiCmdInner.innerMain(FusekiCmd.java:104)
> at org.apache.jena.fuseki.cmd.FusekiCmd.main(FusekiCmd.java:67)
> at org.apache.jena.fuseki.cmd.FusekiCmd.main(FusekiCmd.java:67)
>
> So I suppose I need a complete jetty config file rather than a snippet 
> (unless the above is erroneous anyway). I wasn't able to find the default 
> jetty configuration file in the jars.
>
> I found this 
> https://github.com/apache/jena/blob/master/jena-fuseki2/examples/fuseki-jetty-https.xml
> But it mentions needing configuring further things and I have no clue how to 
> adapt it.
>
> Any pointer, walkthrough or further help most appreciated.
>
> With many thanks and kind regards,
> Pierre


Re: Sensible size limit for SPARQL update payload to Fuseki2?

2019-08-06 Thread Laura Morales
Your best option is to look at the Fuseki logs for the exact error. I've 
personally never POSTed so much data to Fuseki, but I feel like it should not 
be a problem unless something is timing out the connection, or truncating 
the POST data, or your triples contain syntax errors. Another option is to try 
the LOAD operation (https://www.w3.org/TR/sparql11-update/#load), or the SOH 
command line tools (https://jena.apache.org/documentation/fuseki2/soh.html). 
What I would do, personally speaking, is find a way to chunk your data and send 
multiple requests (even if, as I said, 85MB should work. It's not a huge file 
after all).
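A minimal sketch of that chunking idea in Python; the function name and batch size are illustrative:

```python
def chunked_updates(ntriple_lines, batch_size=50000):
    """Yield one INSERT DATA update per batch of N-Triples lines."""
    for i in range(0, len(ntriple_lines), batch_size):
        batch = ntriple_lines[i:i + batch_size]
        yield "INSERT DATA {\n" + "\n".join(batch) + "\n}"

# Three toy triples split into batches of two -> two separate update requests
updates = list(chunked_updates(["<s> <p> <o> ."] * 3, batch_size=2))
print(len(updates))   # 2
```

Each yielded string can then be POSTed as its own request, so no single request hits the form-size limit.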


> Sent: Tuesday, August 06, 2019 at 12:39 PM
> From: "Pierre Grenon" 
> To: "'users@jena.apache.org'" 
> Subject: RE: RE: Sensible size limit for SPARQL update payload to Fuseki2?
>
> > Can you share your query?
>
> Afraid I can't
>
> It looks like :
>
> 
>
> INSERT DATA {
>
> <9000K triples>
>
> }
>
> > I don't understand if you're trying to insert a single literal string that 
> > is 85MB in size, or if you're trying to load 900K triples that are 85MB in 
> > total.
>
> Second one. I'm not trying to load a single triple with a 85Mb object literal 
> but I am trying to perform a single INSERT operation of 900k triples.
>
> Many thanks,
> Pierre


Re: RE: Sensible size limit for SPARQL update payload to Fuseki2?

2019-08-06 Thread Laura Morales
Can you share your query? I don't understand if you're trying to insert a 
single literal string that is 85MB in size, or if you're trying to load 900K 
triples that are 85MB in total.



> Sent: Tuesday, August 06, 2019 at 9:52 AM
> From: "Pierre Grenon" 
> To: "users@jena.apache.org" 
> Subject: RE: Sensible size limit for SPARQL update payload to Fuseki2?
>
> Quick follow up --
>
> The web fuseki interface is happy loading the turtle equivalent. Takes more 
> time (~10-15 sec) than it takes for the programmatic way to return an error 
> (<2 sec), it's about 900k triples.
>
> So I was thinking of splitting my INSERT and was wondering if there is a 
> reasonable chunk size.
>
> As said, will try to look at the logs anyway.
>
> Best,
> Pierre
>
> THIS E-MAIL MAY CONTAIN CONFIDENTIAL AND/OR PRIVILEGED INFORMATION.
> IF YOU ARE NOT THE INTENDED RECIPIENT (OR HAVE RECEIVED THIS E-MAIL
> IN ERROR) PLEASE NOTIFY THE SENDER IMMEDIATELY AND DESTROY THIS
> E-MAIL. ANY UNAUTHORISED COPYING, DISCLOSURE OR DISTRIBUTION OF THE
> MATERIAL IN THIS E-MAIL IS STRICTLY FORBIDDEN.
>
> IN ACCORDANCE WITH MIFID II RULES ON INDUCEMENTS, THE FIRM'S EMPLOYEES
> MAY ATTEND CORPORATE ACCESS EVENTS (DEFINED IN THE FCA HANDBOOK AS
> "THE SERVICE OF ARRANGING OR BRINGING ABOUT CONTACT BETWEEN AN INVESTMENT
> MANAGER AND AN ISSUER OR POTENTIAL ISSUER"). DURING SUCH MEETINGS, THE
> FIRM'S EMPLOYEES MAY ON NO ACCOUNT BE IN RECEIPT OF INSIDE INFORMATION
> (AS DESCRIBED IN ARTICLE 7 OF THE MARKET ABUSE REGULATION (EU) NO 596/2014).
> (https://www.handbook.fca.org.uk/handbook/glossary/G3532m.html)
> COMPANIES WHO DISCLOSE INSIDE INFORMATION ARE IN BREACH OF REGULATION
> AND MUST IMMEDIATELY AND CLEARLY NOTIFY ALL ATTENDEES. FOR INFORMATION
> ON THE FIRM'S POLICY IN RELATION TO ITS PARTICIPATION IN MARKET SOUNDINGS,
> PLEASE SEE https://www.horizon-asset.co.uk/market-soundings/.
>
> HORIZON ASSET LLP IS AUTHORISED AND REGULATED
> BY THE FINANCIAL CONDUCT AUTHORITY.
>
>
> > -Original Message-
> > From: Pierre Grenon
> > Sent: 06 August 2019 08:47
> > To: 'users@jena.apache.org'
> > Subject: RE: Sensible size limit for SPARQL update payload to Fuseki2?
> >
> > Hi,
> >
> > Thanks for your answer.
> >
> > I'll look into the server's logs if I can. I am using this approach because 
> > the
> > server is in fact remote. So I read into a string that I then pass to POST. 
> > The
> > query when saved to file is ~ 85Mb.
> >
> > I guess I was lazy and hoping for an easy answer if this clogged up Fuseki 
> > in
> > known ways and whether that might be addressed through config (memory,
> > thread, whatnots).
> >
> > Will double check a few things and report then.
> >
> > With many thanks and best regards,
> > Pierre
> >
> >
> > From: Laura Morales [mailto:laure...@mail.com]
> > Sent: 06 August 2019 08:19
> > To: users@jena.apache.org
> > Cc: users@jena.apache.org
> > Subject: Re: Sensible size limit for SPARQL update payload to Fuseki2?
> >
> > How long is your query?? Personally I'm not aware of any such limitations,
> > especially when POSTing, but other people here definitely know better than
> > me if there is one or not. If there is a limit, let's say even just 1MB, 
> > you need
> > a *very* long query to break it. 500 is a general "internal error" code, 
> > it's not
> > specific to query length. It could be something else (look at the logs). 
> > Did you
> > try to run the query from the Fuseki web interface, or even with another
> > database entirely? It could help you debug it.
> >
> >
> >
> > > Sent: Tuesday, August 06, 2019 at 8:15 AM
> > > From: "Pierre Grenon" 
> > > To: "users@jena.apache.org" 
> > > Subject: Sensible size limit for SPARQL update payload to Fuseki2?
> > >
> > > Hi,
> > >
> > > Maybe a long shot but thought I'd ask.
> > >
> > > I'm sending updates to a Fuseki2 from an RDFLib/SPARQLWrapper based
> > client. This POSTs an INSERT string to an update endpoint. I get back an 
> > error
> > 500 for strings over a certain large size which (the limit) I haven't tried 
> > to
> > figure out.
> > >
> > > Is there a theoretical, or other, reason why the limit exists and a 
> > > strategy
> > to adopt besides fine tuning the string size?
> > >
> > > Many thanks and best regards,
> > > Pierre
> > >

Re: Sensible size limit for SPARQL update payload to Fuseki2?

2019-08-06 Thread Laura Morales
How long is your query?? Personally I'm not aware of any such limitations, 
especially when POSTing, but other people here definitely know better than me 
if there is one or not. If there is a limit, let's say even just 1MB, you need 
a *very* long query to break it. 500 is a general "internal error" code, it's 
not specific to query length. It could be something else (look at the logs). 
Did you try to run the query from the Fuseki web interface, or even with 
another database entirely? It could help you debug it.



> Sent: Tuesday, August 06, 2019 at 8:15 AM
> From: "Pierre Grenon" 
> To: "users@jena.apache.org" 
> Subject: Sensible size limit for SPARQL update payload to Fuseki2?
>
> Hi,
>
> Maybe a long shot but thought I'd ask.
>
> I'm sending updates to a Fuseki2 from an RDFLib/SPARQLWrapper based client. 
> This POSTs an INSERT string to an update endpoint. I get back an error 500 
> for strings over a certain large size which (the limit) I haven't tried to 
> figure out.
>
> Is there a theoretical, or other, reason why the limit exists and a strategy 
> to adopt besides fine tuning the string size?
>
> Many thanks and best regards,
> Pierre
>
>
>
>


Re: possible to run a TDB2 DB with a fuseki GUI and still have commandline query access?

2019-07-23 Thread Laura Morales
You can use any number of clients to query Fuseki, using the SPARQL Protocol 
(https://www.w3.org/TR/sparql11-protocol/) or the SPARQL Graph Store HTTP 
Protocol (https://www.w3.org/TR/sparql11-http-rdf-update/). Under fuseki/bin 
there are also some CLI tools called SOH 
(https://jena.apache.org/documentation/fuseki2/soh.html) for querying Fuseki.

With the TDB CLI tools you manage the database directory directly, whereas with 
Fuseki you access it indirectly, through the server. In terms of querying I don't 
think there's much difference. The main difference is that the CLI programs 
support more administration tasks: "tdbcompact" for compacting a database, 
"tdbloader" for creating a new database from a file of triples, and "tdbquery" 
with its "--explain" option, which I don't think you can trigger through Fuseki. 
I don't think Fuseki's UI offers any of these maintenance tasks.



> Sent: Wednesday, July 24, 2019 at 5:55 AM
> From: "Jeff Lerman" 
> To: users@jena.apache.org
> Subject: Re: possible to run a TDB2 DB with a fuseki GUI and still have 
> commandline query access?
>
> Hmm. What about submitting queries against the SPARQL endpoint, using a
> different client? Would that work? If I do that, would there be any loss of
> functionality compared to what I could get by using the tdbquery tool?
> 
> --Jeff
> 
> On Tue, Jul 23, 2019, 8:15 PM Laura Morales  wrote:
> 
> > As far as I know it's not possible to run the CLI commands on a live
> > dataset. You have to stop Fuseki first.
> >
> >
> > > Sent: Wednesday, July 24, 2019 at 4:28 AM
> > > From: "Jeff Lerman" 
> > > To: users@jena.apache.org
> > > Subject: possible to run a TDB2 DB with a fuseki GUI and still have
> > commandline query access?
> > >
> > > Is there a recommended technique (or a HOWTO doc) to run a TDB2 DB
> > managed
> > > via the Fuseki web console, and still allows commandline querying?
> > >
> > > I’ve realized that one can’t simply start up Fuseki, populate a TDB2 DB
> > > with it, and then point the commandline tool tdb2.tdbquery at that DB -
> > > that results in "org.apache.jena.dboe.DBOpEnvException: Failed to get a
> > > lock: …”
> > >
> > > Any guidance would be much appreciated; the docs at jena.apache.org
> > don’t
> > > seem to directly address this use-case, but I’m hoping I’m
> > > missing something.
> > >
> > > Thanks!
> > >
> > >
> > >
> > > Jeff Lerman
> > >
> > > AI Scientist
> > >
> > > Mobile: 510-495-4621
> > >
> > > www.invitae.com
> > >
> > >
> >
>


Re: possible to run a TDB2 DB with a fuseki GUI and still have commandline query access?

2019-07-23 Thread Laura Morales
As far as I know it's not possible to run the CLI commands on a live dataset. 
You have to stop Fuseki first.


> Sent: Wednesday, July 24, 2019 at 4:28 AM
> From: "Jeff Lerman" 
> To: users@jena.apache.org
> Subject: possible to run a TDB2 DB with a fuseki GUI and still have 
> commandline query access?
>
> Is there a recommended technique (or a HOWTO doc) to run a TDB2 DB managed
> via the Fuseki web console, and still allows commandline querying?
> 
> I’ve realized that one can’t simply start up Fuseki, populate a TDB2 DB
> with it, and then point the commandline tool tdb2.tdbquery at that DB -
> that results in "org.apache.jena.dboe.DBOpEnvException: Failed to get a
> lock: …”
> 
> Any guidance would be much appreciated; the docs at jena.apache.org don’t
> seem to directly address this use-case, but I’m hoping I’m
> missing something.
> 
> Thanks!
> 
> 
> 
> Jeff Lerman
> 
> AI Scientist
> 
> Mobile: 510-495-4621
> 
> www.invitae.com
> 
> 
>


Re: About fuseki2 load performance by java API

2019-07-19 Thread Laura Morales
> tdb2.tdbloader --loader=parallel
>
> but it still becomes random IO (moves disk heads)
>
> I haven't tried it extensively on an HDD - I'd be interested in hearing
> what happens.


oh nice! I completely missed it. I've tried it with a 67GB .nt file from 
LinkedGeoData on the same 750GB HDD but the end result does not seem very 
different. It's difficult to compare exactly with when I tried to load 
wikidata, because I don't see any progress being reported here. I mean I don't 
see any "X triples loaded (Y per second)" kind of message. Anyway it starts at 
full speed, boiling CPU, HDD cooking up my wrist from beneath the plastic case, 
and fans almost generating enough lift to take off. Then it gradually slows 
down. I stopped it after 1 hour. At this point I was seeing less than 10% CPU 
usage, 90% iowait, TDB2 files size ~15GB.


> The proper solution is either to do caching+write ordering


What does this mean in practice? Can I change my input data (eg. sorting 
triples) so that tdb2.tdbloader can overcome the bottleneck with HDDs?
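One hedged interpretation in shell (file names are illustrative, and whether it actually helps on an HDD is exactly the open question here): sorting and de-duplicating the N-Triples first at least makes identical subjects adjacent, so the input side arrives in a predictable order. The loader invocation is shown for context only.

```shell
# Toy input; in practice this would be the large .nt dump
printf '<s2> <p> <o> .\n<s1> <p> <o> .\n<s1> <p> <o> .\n' > input.nt

# Sort and de-duplicate before bulk loading
sort -u input.nt > sorted.nt
cat sorted.nt

# tdb2.tdbloader --loader=parallel --loc ./tdb2db sorted.nt   # then load the sorted file
```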



Re: RE: About fuseki2 load performance by java API

2019-07-18 Thread Laura Morales
I had a similar problem when trying to load wikidata on my laptop with 8GB RAM, 
i7 CPU, 750GB HDD. It started fine but then slowed to a crawl after about 100 
million triples. I don't think CPU or RAM are the problem, it's probably to do 
with disk queues or caches or something like that. IIRC when Andy tried to load 
the same dataset on his PC with a 1TB SSD and 16GB RAM, he didn't have those 
problems. Bottom line: try with an SSD/NVMe instead of an HDD.

Besides, it would be nice to have a better way (parallelized) for loading huge 
datasets (trillions of triples).



> Sent: Thursday, July 18, 2019 at 2:08 PM
> From: "Scarlet Remilia" 
> To: "users@jena.apache.org" 
> Subject: RE: About fuseki2 load performance by java API
>
> Thank you for reply!
> 
> 
> 
> The server storage is HDD on local with RAID 10.
> 
> CPU is 4x 14 cores with 28 threads but only one core is used during the load.
> 
> The JVM of fuseki2 is tuned by adding -Xmx=50GB -Xms=50GB and TDB2 used is 
> also tuned by tuning cache size.
> 
> I observed disk I/O with iostat, but not much disk I/O seems to be used; I 
> also observed that Fuseki2's memory usage increases after loading every 
> 3 million triples.
> 
> Fuseki2 is setup as a standalone server by the command below:
> 
> 
> 
> ./fuseki-server --tdb2 --loc=./tdb2dataset --port   -update /fuseki2
> 
> 
> 
> Thank you very much!
> 
> 
> 
> Sent from Mail for Windows 10
> 
> 
> 
> 
> From: Andy Seaborne 
> Sent: Thursday, July 18, 2019 6:41:56 PM
> To: users@jena.apache.org
> Subject: Re: About fuseki2 load performance by java API
> 
> That's quite slow. I get maybe 50-70K triples for a 100m load via the
> Fuseki UI.
> 
> The fastest way is to use the bulk loader directly to setup the
> database, then add it to Fuseki.
> 
> The hardware of the server makes a big difference. What's the server
> setup? Disk/SSD? Local or remote storage?
> 
>  Andy
> 
> You don't need the begin/commit in the client - the transaction is in
> the backend server.
> 
> On 18/07/2019 09:02, Scarlet Remilia wrote:
> > Hello everyone,
> > I want to load a hundred million triples into a TDB2-backed Fuseki2 via the 
> > Java API.
> > I used code below:
> >
> > Model model = ModelFactory.createDefaultModel();
> > model.add(model.asStatement(triple));
> > RDFConnectionRemoteBuilder builder = RDFConnectionFuseki.create()
> >         .destination(FusekiURL);
> > RDFConnection conn = builder.build();
> > conn.begin(ReadWrite.WRITE);
> > try {
> >     conn.load(model);
> >     conn.commit();
> > } finally {
> >     conn.end();
> > }
> >
> > The code is actually worked but performance is not ideal enough.
> >
> > [2019-07-18 23:29:25] Fuseki INFO  [46] POST 
> > http://192.168.204.244:/fuseki2?default
> > [2019-07-18 23:30:45] Fuseki INFO  [15] Body: Content-Length=-1, 
> > Content-Type=application/rdf+thrift, Charset=null => RDF-THRIFT : 
> > Count=3257309 Triples=3257309 Quads=0
> > [2019-07-18 23:31:12] Fuseki INFO  [15] 200 OK (3,302.546 s)
> >
> > Every 3 million triples costs 3,302.546 seconds, and there are 300 million 
> > triples in the queue in total… (one in-memory Model cannot hold that many 
> > triples…)
> >
> > Is there any better method to load them quicker?
> >
> > Thanks!
> >
> >
> >
>


Re: What detail steps are required to Restore a TDB from the back-up file

2019-07-18 Thread Laura Morales
To restore from Fuseki web UI, in the "manage datasets" page there are buttons 
for creating new stores as well as for selecting your files containing the 
triples ("upload data").

From the CLI you can type "tdbloader --help" for the list of options.
Example: "tdbloader --loc mydb data.nt" will create a new directory "mydb" with 
a TDB store using data from data.nt.



> Sent: Thursday, July 18, 2019 at 1:11 AM
> From: "Al Shapiro" 
> To: Jena-users-ml 
> Subject: What detail steps are required to Restore a TDB from the back-up file
>
> Hi guys again!
> How does one restore a TDB Back-up file created via Apache Jena Fuseki 
> utility screen?
> 
> I received part of the answer to the above question, which is "tdbloader" and 
> CLI...
> Thank you all for the above, but I now would appreciate the detail steps 
> (example...) to use the "tdbloader" from the Command Line Interface (CLI) via 
> the Command Prompt to Restore the TDB from the TDB Back-up file...
> Thank you again,
> 
> Al Shapiro
> 
>


Re: How does one restore a TDB Back-up file created via Apache Jena Fuseki...

2019-07-17 Thread Laura Morales
Backups are automatically saved inside $FUSEKI_BASE/backups/ as .nq files. You 
either reload those files from the same web interface or create a new TDB store 
using tdbloader(2). Or you can probably load them into another existing TDB 
store using tdbupdate. BTW you can also backup from the CLI with tdbbackup 
instead of using the web interface.



> Sent: Wednesday, July 17, 2019 at 5:53 AM
> From: "Al Shapiro" 
> To: "users@jena.apache.org" 
> Subject: How does one restore a TDB Back-up file created via Apache Jena 
> Fuseki...
>
> Hi guys!
> How does one restore a TDB Back-up file created via Apache Jena Fuseki 
> utility screen?
>
> I created a TDB Back-up via:
> Apache Jena Fuseki --
> Manage datasets - Perform management actions on existing datasets, including 
> backup, or add a new dataset
>
> What do I need to do to restore the Back-up TDB file to the TDB?
> Thank you,
> Al Shapiro
>
>


Fw: Import rules from another GenericRuleReasoner file

2019-07-05 Thread Laura Morales
@include .

reference: https://jena.apache.org/documentation/inference/#rules
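For example, a hedged sketch of a rule file pulling in another (file names and prefix are illustrative):

```
# main.rules
@prefix ex: <http://example.org/#>.
@include <extra.rules>.

[ localRule: (?a ex:parent ?b) -> (?b ex:child ?a) ]
```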




> Sent: Friday, July 05, 2019 at 1:21 PM
> From: "Laura Morales" 
> To: jena-users-ml 
> Subject: Import rules from another GenericRuleReasoner file
>
> I have a file that contains rules for a GenericRuleReasoner. Is it possible 
> to import another file containing more rules, from the former?
>


Import rules from another GenericRuleReasoner file

2019-07-05 Thread Laura Morales
I have a file that contains rules for a GenericRuleReasoner. Is it possible to 
import another file containing more rules, from the former?



Re: RDFDataset dump inferred triples only

2019-07-02 Thread Laura Morales
Thank you.


> Sent: Tuesday, July 02, 2019 at 3:26 PM
> From: "Andy Seaborne" 
> To: users@jena.apache.org
> Subject: Re: RDFDataset dump inferred triples only
>
> Somehow subtract the base triples from the complete data.
>
> If there are no bnodes, get the base and complete to a file and use
> sort(1) and comm(1).
>
> If there are bnodes, some (slow!) query with a GRAPH/FILTER (NOT) EXISTS
> might do it.
>
>  Andy
>
> On 02/07/2019 11:12, Dave Reynolds wrote:
> > On 02/07/2019 11:09, Laura Morales wrote:
> >> Can I do this with one of the Jena command line tools?
> >
> > Not that I know of, don't think there's a command line tool for running
> > a set rules over data.
> >
> > Dave
> >
> >>> Sent: Tuesday, July 02, 2019 at 11:34 AM
> >>> From: "Dave Reynolds" 
> >>> To: users@jena.apache.org
> >>> Subject: Re: RDFDataset dump inferred triples only
> >>>
> >>> On 02/07/2019 09:19, Laura Morales wrote:
> >>>> How can I dump to a .nt files *only* the inferred triples in a
> >>>> RDFDataset?
> >>>> In other words, I have a RDFDataset with a GenericRuleReasoner
> >>>> InfModel, and I would like to export all the inferred triples to a
> >>>> file.
> >>>
> >>> If you are only using forward rules then use getDeductionsModel to get
> >>> just the inferred triples and serialize that.
> >>>
> >>> If you are using backward or hybrid rules then it's trickier. You would
> >>> have to materialize everything the backward rules can find as well,
> >>> remove the starting graph and then serialize that.
> >>>
> >>> Dave
> >>>
> >>>
>
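Andy's sort(1)/comm(1) suggestion can be sketched like this; file names and triples are illustrative, and it assumes no blank nodes and identical serialization of each triple in both dumps:

```shell
# full.nt = base + inferred triples; base.nt = asserted triples only
printf '<s> <p> <o1> .\n<s> <p> <o2> .\n' > full.nt
printf '<s> <p> <o1> .\n' > base.nt

sort full.nt > full.sorted
sort base.nt > base.sorted

# Lines present only in the full dump are the inferred triples
comm -23 full.sorted base.sorted > inferred.nt
cat inferred.nt
```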


Re: RDFDataset dump inferred triples only

2019-07-02 Thread Laura Morales
Can I do this with one of the Jena command line tools?


> Sent: Tuesday, July 02, 2019 at 11:34 AM
> From: "Dave Reynolds" 
> To: users@jena.apache.org
> Subject: Re: RDFDataset dump inferred triples only
>
> On 02/07/2019 09:19, Laura Morales wrote:
> > How can I dump to a .nt files *only* the inferred triples in a RDFDataset?
> > In other words, I have a RDFDataset with a GenericRuleReasoner InfModel, 
> > and I would like to export all the inferred triples to a file.
>
> If you are only using forward rules then use getDeductionsModel to get
> just the inferred triples and serialize that.
>
> If you are using backward or hybrid rules then it's trickier. You would
> have to materialize everything the backward rules can find as well,
> remove the starting graph and then serialize that.
>
> Dave
>
>


RDFDataset dump inferred triples only

2019-07-02 Thread Laura Morales
How can I dump to a .nt file *only* the inferred triples in an RDFDataset?
In other words, I have a RDFDataset with a GenericRuleReasoner InfModel, and I 
would like to export all the inferred triples to a file.
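Dave's getDeductionsModel suggestion in the reply above can be sketched in Java. This is a minimal sketch, assuming Jena is on the classpath, that `inf` is the already-constructed forward-rule InfModel, and that the output filename is illustrative:

```java
import java.io.FileOutputStream;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;

// `inf` is the InfModel built with a (forward-rule) GenericRuleReasoner.
// getDeductionsModel() returns only the triples added by the rules,
// not the triples of the base model.
Model deductions = inf.getDeductionsModel();
try (FileOutputStream out = new FileOutputStream("inferred.nt")) {
    RDFDataMgr.write(out, deductions, Lang.NTRIPLES);
}
```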


Fw: GenericRuleReasoner live rule update

2019-06-26 Thread Laura Morales
To explain my problem a little better, I have a program (website) that is used 
by several people with Fuseki in the backend, and I would like to accept 
user-defined inference rules. The only way I know to add new rules is by 
changing the configuration files and reloading Fuseki. This is not ideal for 
two reasons: first, it requires a restart for the new configuration files to 
be read, and second, if a rule has a syntax error Fuseki stops with an 
exception. I've read in the documentation about ja:rule but I feel like it 
doesn't solve the problem since it too must be defined in the configuration 
files.
I would like to know if there's a way that I can add inference rules simply by 
updating a graph (some kind of Fuseki "configuration graph" with ja:rule 
maybe?) instead of writing the configuration files, or if broken rules can be 
skipped instead of blocking Fuseki.
Thank you so much!
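For reference, rule text does not have to live in a separate file: the assembler vocabulary also allows inline rules. A minimal sketch, assuming the ja:rule property (which takes the rule text as a string literal) is accepted directly on the reasoner description — note this still lives in the configuration file, so it does not by itself solve the live-update problem:

```turtle
:model_inf a ja:InfModel ;
    ja:baseModel :g ;
    ja:reasoner [
        ja:reasonerURL <http://jena.hpl.hp.com/2003/GenericRuleReasoner> ;
        ja:rule "[ okrule: (?s rdf:type ex:Person) -> (?s ex:works 'OK') ]" ;
    ] .
```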


> Sent: Tuesday, June 25, 2019 at 10:39 AM
> From: "Laura Morales" 
> To: jena-users-ml 
> Subject: GenericRuleReasoner live rule update
>
> Is it possible to live-reload GenericRuleReasoner rules? That is without 
> restarting Fuseki?


GenericRuleReasoner live rule update

2019-06-25 Thread Laura Morales
Is it possible to live-reload GenericRuleReasoner rules? That is without 
restarting Fuseki?


Fw: Cannot setup GenericRuleReasoner

2019-06-25 Thread Laura Morales
It seems to work if I replace the rule

[ okrule: (?s a ex:Person) -> (?s ex:works "OK") ]

with this rule

[ okrule: (?s rdf:type ex:Person) -> (?s ex:works "OK") ]

is this a bug?




> Sent: Tuesday, June 25, 2019 at 9:40 AM
> From: "Laura Morales" 
> To: jena-users-ml 
> Subject: Cannot setup GenericRuleReasoner
>
> What's wrong with this configuration? It doesn't seem to infer any triples 
> when I query the dataset (Fuseki 3.6.0). It doesn't show any errors either.
>
> config.ttl
> 
> PREFIX :   <#>
> PREFIX fuseki: <http://jena.apache.org/fuseki#>
> PREFIX ja: <http://jena.hpl.hp.com/2005/11/Assembler#>
> PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX rdfs:   <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX tdb:<http://jena.hpl.hp.com/2008/tdb#>
>
> :service a fuseki:Service ;
> rdfs:label"test" ;
> fuseki:name   "test" ;
> fuseki:serviceQuery   "query" ;
> fuseki:serviceReadGraphStore  "get" ;
> fuseki:serviceReadWriteGraphStore "data" ;
> fuseki:serviceUpdate  "update" ;
> fuseki:serviceUpload  "upload" ;
> fuseki:dataset:dataset ;
> .
>
> :dataset a ja:RDFDataset ;
> ja:defaultGraph :model_inf .
>
> :model_inf a ja:InfModel ;
> ja:baseModel :g ;
> ja:reasoner [
> ja:reasonerURL <http://jena.hpl.hp.com/2003/GenericRuleReasoner> ;
> ja:rulesFrom  ;
> ] .
>
> :ds a tdb:DatasetTDB ;
> tdb:location "/opt/fuseki/run/databases/ds/" .
>
> :g a tdb:GraphTDB ;
> tdb:dataset :ds .
>
>
> rules
> 
> @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix owl:  <http://www.w3.org/2002/07/owl#> .
> @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
> @prefix ex:   <https://example.org#> .
>
> [ okrule: (?s a ex:Person) -> (?s ex:works "OK") ]
>


Cannot setup GenericRuleReasoner

2019-06-25 Thread Laura Morales
What's wrong with this configuration? It doesn't seem to infer any triples when 
I query the dataset (Fuseki 3.6.0). It doesn't show any errors either.

config.ttl

PREFIX :   <#>
PREFIX fuseki: 
PREFIX ja: 
PREFIX rdf:
PREFIX rdfs:   
PREFIX tdb:

:service a fuseki:Service ;
rdfs:label"test" ;
fuseki:name   "test" ;
fuseki:serviceQuery   "query" ;
fuseki:serviceReadGraphStore  "get" ;
fuseki:serviceReadWriteGraphStore "data" ;
fuseki:serviceUpdate  "update" ;
fuseki:serviceUpload  "upload" ;
fuseki:dataset:dataset ;
.

:dataset a ja:RDFDataset ;
ja:defaultGraph :model_inf .

:model_inf a ja:InfModel ;
ja:baseModel :g ;
ja:reasoner [
ja:reasonerURL  ;
ja:rulesFrom  ;
] .

:ds a tdb:DatasetTDB ;
tdb:location "/opt/fuseki/run/databases/ds/" .

:g a tdb:GraphTDB ;
tdb:dataset :ds .


rules

@prefix rdf:   .
@prefix rdfs:  .
@prefix owl:   .
@prefix xsd:   .
@prefix ex:    .

[ okrule: (?s a ex:Person) -> (?s ex:works "OK") ]



Re: INSERT if not exists

2019-06-17 Thread Laura Morales
Thank you very much!

(For my future reference: 
https://www.w3.org/TR/rdf-sparql-query/#emptyGroupPattern)
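The diagnostic Andy suggests in the reply below can be sketched as a plain SELECT: run the WHERE clause on its own and inspect the rows the INSERT would loop over.

```sparql
PREFIX ex: <https://example.org#>

# Returns exactly one (empty) row when no matching item exists,
# and zero rows when one does - so the INSERT fires at most once.
SELECT * WHERE {
    FILTER NOT EXISTS {
        [] a ex:Item ;
           ex:serial "XYZ" .
    }
}
```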


> Sent: Monday, June 17, 2019 at 10:55 AM
> From: "Andy Seaborne" 
> To: users@jena.apache.org
> Subject: Re: INSERT if not exists
>
> On 16/06/2019 21:01, Laura Morales wrote:
> > I would like to add a new node if and only if another node with the same 
> > properties does not exist. Something like this:
> >
> >
> >  INSERT {
> >  ex:item-X a ex:Item;
> >ex:serial "XYZ" .
> >  } if and only if there is not already another node with (ex:serial 
> > "XYZ")
> >
> >
> > After trial and error I got this working
> >
> >
> >  INSERT {
> >  ex:item-X a ex:Item;
> >ex:serial "XYZ" .
> >  }
> >  WHERE {
> >  FILTER NOT EXISTS {
> >  [] a ex:Item;
> > ex:serial "XYZ" .
> >  }
> >  }
> >
> >
> > my problem is that I don't understand why it's working. First of all, is 
> > the query correct?
>
> Yes.
>
> > Second, how does Jena compute this query?
>
> Try running as "SELECT * WHERE" to see that it returns either one row,
> (no variables) if the pattern does not exist or no rows when something
> exists.
>
> INSERT is a loop on the rows of the WHERE
>
> When there is one row, INSERT does something.
> When there are no rows, INSERT does not happen.
>
> > Why does that work, but not this one?
> >
> >  WHERE {
> >  ?s ?p ?o .
> >
> >  FILTER NOT EXISTS {
> >  [] a ex:Item;
> > ex:serial "XYZ" .
> >  }
> >  }
> >
>
> Tested on an empty graph?
>
> This does not work on the empty graph because ?s ?p ?o does not match.
>
> There is an implicit empty pattern in the first update and the empty
> pattern matches (one row, no variables) even on the empty graph.
>
>  Andy
>


INSERT if not exists

2019-06-16 Thread Laura Morales
I would like to add a new node if and only if another node with the same 
properties does not exist. Something like this:


INSERT {
ex:item-X a ex:Item;
  ex:serial "XYZ" .
} if and only if there is not already another node with (ex:serial "XYZ")


After trial and error I got this working


INSERT {
ex:item-X a ex:Item;
  ex:serial "XYZ" .
}
WHERE {
FILTER NOT EXISTS {
[] a ex:Item;
   ex:serial "XYZ" .
}
}


my problem is that I don't understand why it's working. First of all, is the 
query correct? Second, how does Jena compute this query? Why does that work, 
but not this one?

WHERE {
?s ?p ?o .

FILTER NOT EXISTS {
[] a ex:Item;
   ex:serial "XYZ" .
}
}


Fuseki graph constraints

2019-06-16 Thread Laura Morales
Does Fuseki support graph constraints? Something like the equivalent of 
"composite primary keys" in SQL?


Re: Fuseki union of two TDB(2) datasets

2019-06-13 Thread Laura Morales
> JENA-1667

I think my problem is exactly the same one described by Ashley Sommer in 
JENA-1663.


Re: tdb2.tdbsync

2019-06-13 Thread Laura Morales
yes yes of course I can reload everything, that's what I do already. I simply 
thought it might be quite handy if, for instance, I had a folder containing an 
arbitrary number of RDF files, and as these files change I could call a 
tdb2.tdbsync tool that automatically updates a TDB dataset with only the 
changes (instead of reloading everything).


> Sent: Thursday, June 13, 2019 at 10:26 AM
> From: "Rob Vesse" 
> To: users@jena.apache.org
> Subject: Re: tdb2.tdbsync
>
> Can you not just do a fresh TDB load into a new dataset from the data file?
>
> This would be much faster and more performant than what you are proposing (in 
> particular the delete handling would be very expensive)
>
> Rob



tdb2.tdbsync

2019-06-12 Thread Laura Morales
This is only a potential suggestion, not an issue.

I think it could be handy to have a tdb2.tdbsync tool for synchronizing a TDB 
dataset with an RDF file (or files). Something to use like this: tdb2.tdbsync 
--loc dataset data.nt, which would automatically delete/insert triples to keep 
the dataset in sync with the changes in the file. It would be handy when batch 
processing a large number of triples.


Re: Fuseki union of two TDB(2) datasets

2019-06-12 Thread Laura Morales
Thank you!

> Sent: Thursday, June 13, 2019 at 1:05 AM
> From: "ajs6f" 
> To: users@jena.apache.org
> Subject: Re: Fuseki union of two TDB(2) datasets
>
> Filed as https://issues.apache.org/jira/browse/JENA-1721.
>
> ajs6f


Re: Fuseki union of two TDB(2) datasets

2019-06-12 Thread Laura Morales
> > PS: this only works with TDB1 right? Any way to make it work with TDB2?
>
> No. Works for anything and mixtures:
>
> Change the tdb:GraphTDB and tdb:DatasetTDB
> to tdb2:GraphTDB2 and tdb2:DatasetTDB2

TDB1 works, but with TDB2 all I get is a warning: "500 : Not in a transaction".
This is the assembler:

<#dataset> a ja:RDFDataset;
ja:graph <#all> ;
.

<#all> a ja:UnionModel ;
ja:subModel <#graph1> ;
ja:subModel <#graph2> ;
.

<#graph1> a tdb2:GraphTDB2 ;
tdb2:dataset <#ds1> ;
#tdb2:graphName  ;
.

<#graph2> a tdb2:GraphTDB2 ;
tdb2:dataset <#ds2> ;
#tdb2:graphName  ;
.

<#ds1> a tdb2:DatasetTDB2 ;
tdb2:location "../DB1" ;
.

<#ds2> a tdb2:DatasetTDB2 ;
tdb2:location "../DB2" ;
.

And output, with or without :

$ ./fuseki-server --verbose --debug
[2019-06-13 05:29:06] QueryIteratorCheck WARN  Open iterator: TripleMapper/83
[2019-06-13 05:29:06] Fuseki WARN  [4] RC = 500 : Not in a transaction
org.apache.jena.dboe.transaction.txn.TransactionException: Not in a transaction
    at org.apache.jena.tdb2.store.DatasetGraphTDB.requireTxn(DatasetGraphTDB.java:168)
    at org.apache.jena.tdb2.store.DatasetGraphTDB.findInDftGraph(DatasetGraphTDB.java:101)
    at org.apache.jena.sparql.core.DatasetGraphBaseFind.find(DatasetGraphBaseFind.java:47)
    at org.apache.jena.sparql.core.DatasetGraphWrapper.find(DatasetGraphWrapper.java:181)
    at org.apache.jena.sparql.core.GraphView.graphBaseFind(GraphView.java:121)
    at org.apache.jena.sparql.core.GraphView.graphBaseFind(GraphView.java:113)
    at org.apache.jena.graph.impl.GraphBase.find(GraphBase.java:241)
    at org.apache.jena.graph.compose.MultiUnion.multiGraphFind(MultiUnion.java:170)
    at org.apache.jena.graph.compose.MultiUnion.graphBaseFind(MultiUnion.java:147)
    at org.apache.jena.graph.impl.GraphBase.find(GraphBase.java:241)
    at org.apache.jena.graph.impl.GraphBase.graphBaseFind(GraphBase.java:258)
    at org.apache.jena.graph.impl.GraphBase.find(GraphBase.java:255)
    at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.<init>(QueryIterTriplePattern.java:75)
    at org.apache.jena.sparql.engine.iterator.QueryIterTriplePattern.nextStage(QueryIterTriplePattern.java:49)
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:108)
    at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
    at org.apache.jena.sparql.engine.iterator.QueryIterBlockTriples.hasNextBinding(QueryIterBlockTriples.java:63)
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
    at org.apache.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:58)
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
    at org.apache.jena.sparql.engine.iterator.QueryIterDistinct.getInputNextUnseen(QueryIterDistinct.java:104)
    at org.apache.jena.sparql.engine.iterator.QueryIterDistinct.hasNextBinding(QueryIterDistinct.java:70)
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
    at org.apache.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
    at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
    at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
    at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
    at org.apache.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:74)
    at org.apache.jena.sparql.engine.ResultSetCheckCondition.hasNext(ResultSetCheckCondition.java:55)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.executeQuery(SPARQL_Query.java:341)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.execute(SPARQL_Query.java:290)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.executeWithParameter(SPARQL_Query.java:239)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.perform(SPARQL_Query.java:224)
    at org.apache.jena.fuseki.servlets.ActionService.executeLifecycle(ActionService.java:266)
    at org.apache.jena.fuseki.servlets.ActionService.execCommonWorker(ActionService.java:155)
    at org.apache.jena.fuseki.servlets.ActionBase.doCommon(ActionBase.java:74)
    at

Re: Fuseki union of two TDB(2) datasets

2019-06-12 Thread Laura Morales
> > The reason why I would like to be able to do this, instead of merging one 
> > dataset into another, is simply because the two datasets come from 
> > different origins.
>
> With an overlap of named graphs?

Potentially yes, but for my particular case I can work around it by renaming my 
graphs if it makes any difference.

I've done a quick test using your assembler with ja:UnionModel, and using 2 TDB 
datasets that I've created like this: tdbloader2 --loc dataset-1 data.nt 
(data.nt contains only 1 unnamed graph). SELECT..WHERE queries return an empty 
result set, but it *does* seem to work if I remove the tdb:graphName 
properties.

Where can I find more documentation about ja:UnionModel? I've never used it 
before and I don't really understand what's going on...

PS: this only works with TDB1 right? Any way to make it work with TDB2?
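For reference, the TDB1 variant that worked in this test can be written out as follows — a minimal sketch assuming two TDB1 datasets at ../DB1 and ../DB2, each with its data in the default graph (hence no tdb:graphName), plus the usual ja: and tdb: prefix declarations:

```turtle
<#dataset> a ja:RDFDataset ;
    ja:defaultGraph <#all> .

<#all> a ja:UnionModel ;
    ja:subModel <#graph1> ;
    ja:subModel <#graph2> .

<#graph1> a tdb:GraphTDB ;
    tdb:dataset <#ds1> .

<#graph2> a tdb:GraphTDB ;
    tdb:dataset <#ds2> .

<#ds1> a tdb:DatasetTDB ;
    tdb:location "../DB1" .

<#ds2> a tdb:DatasetTDB ;
    tdb:location "../DB2" .
```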



Re: Fuseki union of two TDB(2) datasets

2019-06-12 Thread Laura Morales
> From the Java side, there is UnionDatasetGraph, which would be able to give 
> you a view over two datasets, but I'm not sure we ever wrote an assembler for 
> that. I don't see one in the codebase, although it shouldn't be too hard. 
> There are DatasetDescriptionAssembler and UnionModelAssembler and you might 
> be able to cobble something together from them.

Changing Jena/Fuseki Java source code is way above what I'm capable of doing. 
Reading the documentation, I might assemble a ja:RDFDataset, but does it 
support any property like tdb:unionDefaultGraph?

<#dataset> rdf:type ja:RDFDataset ;
tdb:unionDefaultGraph true ; <<
ja:namedGraph
[ ja:graphName  ;
  ja:graph <#graph1> ] ;
ja:namedGraph
[ ja:graphName  ;
  ja:graph <#graph2> ] ;
.

<#graph1> rdf:type tdb:GraphTDB ;
tdb:location "DB-1" ;
.

<#graph2> rdf:type tdb:GraphTDB ;
tdb:location "DB-2" ;
.


Re: Fuseki union of two TDB(2) datasets

2019-06-12 Thread Laura Morales
Basically I have a Fuseki server with 1 TDB dataset containing multiple graphs, 
like this (stripped down configuration):


<#service> rdf:type fuseki:Service ;
fuseki:dataset <#dataset> ;
.

<#dataset> rdf:type tdb:DatasetTDB ;
tdb:location "DB" ;
tdb:unionDefaultGraph true ;
.


So the default graph is a graph with tuples from all the named graphs in the 
same dataset. What I would like to do is simply extend the default graph to 
include tuples from another dataset, if it's possible. Something like this 
(pseudo configuration):


<#service> rdf:type fuseki:Service ;
fuseki:dataset <#dataset-1> ;
fuseki:dataset <#dataset-2> ;
ja:defaultGraph <#dataset-1>, <#dataset-2> ;
.

<#dataset-1> rdf:type tdb:DatasetTDB ;
tdb:location "DB-1" ;
.

<#dataset-2> rdf:type tdb:DatasetTDB ;
tdb:location "DB-2" ;
.


The reason why I would like to be able to do this, instead of merging one 
dataset into another, is simply because the two datasets come from different 
origins. I control one dataset, but I'm importing the other; however I would 
like to SELECT..WHERE over all graphs combined. Merging the 3rd party dataset 
into my dataset is something that I would like to avoid because it adds a lot 
of complexity, so I was wondering if I can configure Fuseki to automatically 
use two distinct datasets as a union default graph.
Is this possible with either TDB or TDB2? I hope my question is clear and makes 
sense. Thank you.




> Sent: Wednesday, June 12, 2019 at 1:08 PM
> From: "Andy Seaborne" 
> To: users@jena.apache.org
> Subject: Re: Fuseki union of two TDB(2) datasets
>
> What's the use case here?
>
> You can have a general dataset which has a default graph being the union
> of two graphs, each from a TDB1/TSDB2 dataset.
>
> But maybe there is a better way for the use case.
>
>  Andy


Fuseki union of two TDB(2) datasets

2019-06-12 Thread Laura Morales
Can I configure Fuseki to use the union of two TDB or TDB2 datasets as the 
default union graph?

