Re: Requesting advice on Fuseki memory settings

2024-03-07 Thread Martynas Jusevičius
If it helps, I have a setup I have used to profile Fuseki in VisualVM:
https://github.com/AtomGraph/fuseki-docker
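
And a rough in-process version of the "force a GC and measure" approach Andy
describes below, as a minimal sketch (System.gc() is only a request to the
JVM, so a profiler remains the more reliable option):

    Runtime rt = Runtime.getRuntime();
    System.gc(); // request a full GC; the JVM is free to ignore it
    long inUseMB = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    System.out.println("post-GC in-use heap: " + inUseMB + " MB");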



On Thu, 7 Mar 2024 at 22.55, Andy Seaborne  wrote:

>
>
> On 07/03/2024 13:24, Gaspar Bartalus wrote:
> > Dear Jena support team,
> >
> > We would like to ask you to help us in configuring the memory for our
> > jena-fuseki instance running in kubernetes.
> >
> > *We have the following setup:*
> >
> > * Jena-fuseki deployed as StatefulSet to a k8s cluster with the
> > resource config:
> >
> > Limits:
> >   cpu: 2
> >   memory:  16Gi
> > Requests:
> >   cpu: 100m
> >   memory:  11Gi
> >
> > * The JVM_ARGS has the following value: -Xmx10G
> >
> > * Our main dataset of type TDB2 contains ~1 million triples.
> A million triples doesn't take up much RAM even in a memory dataset.
>
> In Java, the JVM will grow until it is close to the -Xmx figure. A major
> GC will then free up a lot of memory. But the JVM does not give the
> memory back to the kernel.
>
> TDB2 does not only use heap space. A heap of 2-4G is usually enough per
> dataset, sometimes less (data shape dependent - e.g. many large
> literals use more space).
>
> Use a profiler to examine the heap in-use; you'll probably see a
> saw-tooth shape.
> Force a GC and see the level of in-use memory afterwards.
> Add some safety margin and work space for requests and try that as the
> heap size.
>
> > *  We execute the following type of UPDATE operations:
> >- There are triggers in the system (e.g. users of the application
> > changing the data) which start ~50 other update operations containing
> > up to ~30K triples. Most of them run in parallel, some are delayed
> > by seconds or minutes.
> >- There are scheduled UPDATE operations (executed on an hourly basis)
> > containing 30K-500K triples.
> >- These UPDATE operations usually delete and insert the same number
> > of triples in the dataset. We use the compact API as a nightly job.
> >
> > *We are noticing the following behaviour:*
> >
> > * Fuseki consumes 5-10G of heap memory continuously, as configured in
> > the JVM_ARGS.
> >
> > * There are points in time when the volume usage of the k8s container
> > starts to increase suddenly. This does not drop even though compaction
> > is successfully executed and the dataset size (triple count) does not
> > increase. See attachment below.
> >
> > *Our suspicions:*
> >
> > * garbage collection in Java is often delayed; memory is not freed as
> > quickly as we would expect it, and the heap limit is reached quickly
> > if multiple parallel queries are run
> > * long running database queries can send regular memory to Gen2, that
> > is not actively cleaned by the garbage collector
> > * memory-mapped files are also garbage-collected (and perhaps they
> > could go to Gen2 as well, using more and more storage space).
> >
> > Could you please explain the possible reasons behind such behaviour?
> > And finally could you please suggest a more appropriate configuration
> > for our use case?
> >
> > Thanks in advance and best wishes,
> > Gaspar Bartalus
> >
>


Re: Checking that SPARQL Update will not validate SHACL constraints

2023-12-14 Thread Martynas Jusevičius
Arne’s email got lost somehow but I see it in Andy’s reply.

Thanks for the suggestions.
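
For the record, a minimal untested sketch of Arne's Delta idea, assuming a
local Graph 'base', the update string and a shapes graph (it ignores the
concurrency caveat discussed below):

    import org.apache.jena.graph.Graph;
    import org.apache.jena.graph.GraphUtil;
    import org.apache.jena.graph.compose.Delta;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.shacl.*;
    import org.apache.jena.update.UpdateAction;
    import org.apache.jena.update.UpdateFactory;

    static boolean applyIfValid(Graph base, String updateString, Graph shapesGraph) {
        Delta delta = new Delta(base); // records adds/deletes, leaves 'base' untouched
        UpdateAction.execute(UpdateFactory.create(updateString),
                ModelFactory.createModelForGraph(delta));
        ValidationReport report = ShaclValidator.get()
                .validate(Shapes.parse(shapesGraph), delta);
        if (!report.conforms())
            return false;
        GraphUtil.deleteFrom(base, delta.getDeletions()); // replay the recorded deletes
        GraphUtil.addInto(base, delta.getAdditions());    // then the recorded adds
        return true;
    }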

On Wed, 13 Dec 2023 at 19.52, Andy Seaborne  wrote:

>
>
> On 13/12/2023 15:49, Arne Bernhardt wrote:
> > Hello Martynas,
> >
> > I have no experience with implementing a validation layer for Fuseki.
> >
> > But I might have an idea for your suggested approach:
> > Instead of loading a copy of the graph and modifying it, you could create
> > an org.apache.jena.graph.compose.Delta based on the unmodified graph.
> > Then apply the update to the delta graph and validate the SHACL on the
> > delta graph. If the validation is successful, you can safely apply the
> > update to the original graph and discard the delta graph.
> >
> > You still have to deal with concurrency. For example, the original graph
> > could be changed by a second, faster update while you are still
> validating
> > the first update. It would not be safe to apply the validated changes to
> a
> > graph that has been changed in the meantime.
> >
> > Arne
>
> It'll depend on the SHACL. Many constraints don't need all the data
> available. Some need just the subject and all its properties (e.g.
> sh:maxCount). Some need all the data (SPARQL ones - they are opaque to
> analysis, so in general they need all the data).
>
> If the proxy layer is in the same JVM, BufferingDatasetGraph may help.
> It can be used to capture the adds and deletes. It can then be validated
> (all data or only the data changing). Flush the changes to the database
> just before the end of the request, in the proxy-level commit.
>
> If the proxy is in a different JVM, then only certain constraints can be
> supported but they do tend to be the most common checks.
>
>  Andy
>
> >
> >
> >
> >
> > Am Mi., 13. Dez. 2023 um 14:29 Uhr schrieb Martynas Jusevičius <
> > marty...@atomgraph.com>:
> >
> >> Hi,
> >>
> >> I have an objective to only persist constraint-validated data in Fuseki.
> >>
> >> I have a proxy layer that validates all incoming GSP PUT and POST
> >> request graphs in memory and rejects the invalid ones. So far so good.
> >>
> >> What about SPARQL Update requests though? For simplicity's sake, let's
> >> say they are restricted to a single graph as in GSP PATCH [1].
> >> What I can think of is first loading the graph into memory and
> >> executing the update, and then validating the resulting graph against
> >> SHACL. But maybe there's a smarter way?
> >>
> >> Also interested in the more general case without the graph restriction.
> >>
> >> Martynas
> >>
> >> [1] https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch
> >>
> >
>


Checking that SPARQL Update will not validate SHACL constraints

2023-12-13 Thread Martynas Jusevičius
Hi,

I have an objective to only persist constraint-validated data in Fuseki.

I have a proxy layer that validates all incoming GSP PUT and POST
request graphs in memory and rejects the invalid ones. So far so good.

What about SPARQL Update requests though? For simplicity's sake, let's
say they are restricted to a single graph as in GSP PATCH [1].
What I can think of is first loading the graph into memory and
executing the update, and then validating the resulting graph against
SHACL. But maybe there's a smarter way?

Also interested in the more general case without the graph restriction.
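
To make the above idea concrete, a minimal sketch of load-copy, update, then
validate (names are illustrative; 'target' is the graph being updated and
'shapes' the pre-parsed SHACL shapes):

    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.shacl.ShaclValidator;
    import org.apache.jena.shacl.Shapes;
    import org.apache.jena.update.UpdateAction;
    import org.apache.jena.update.UpdateFactory;

    static boolean wouldConform(Model target, String updateString, Shapes shapes) {
        Model scratch = ModelFactory.createDefaultModel().add(target); // in-memory copy
        UpdateAction.execute(UpdateFactory.create(updateString), scratch);
        return ShaclValidator.get().validate(shapes, scratch.getGraph()).conforms();
    }

If it conforms, the same update can then be executed against the real store.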

Martynas

[1] https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch


Re: Problem running AtomGraph/fuseki-docker

2023-12-06 Thread Martynas Jusevičius
Hi Steve,

This looks like a Windows shell issue.

For some reason /ds is resolved as a filepath where it shouldn’t be.

Can you try --mem '/ds' with quotes?

I’m running Docker on WSL2 and never had this problem.

Martynas

On Wed, 6 Dec 2023 at 21.05, Steve Vestal  wrote:

> I am running a VM with Microsoft Windows Server 2019 (64-bit). When I
> try to stand up the docker server, I get
>
> $ docker run --rm -p 3030:3030 atomgraph/fuseki --mem /ds
> String '/C:/Program Files/Git/ds' not valid as 'service'
>
> Suggestions?
>
>


Re: Querying URL with square brackets

2023-11-24 Thread Martynas Jusevičius
On Fri, Nov 24, 2023 at 12:50 PM Laura Morales  wrote:
>
> > If you want a page for every book, don't use fragment URIs. Use
> > http://example.org/book/1 or http://example.org/book/1#this instead of
> >  http://example.org/book#1.
>
> yes yes I agree with this. I only tried to present an example of yet another 
> "quirk" between raw data and browsers (where this kind of data is supposed to 
> be used).

Still don't understand the problem :) http://example.org/book#1
uniquely identifies a resource, but you'll need to get the whole
http://example.org/book document to retrieve it. That's just how HTTP
works.


Re: Querying URL with square brackets

2023-11-24 Thread Martynas Jusevičius
On Fri, Nov 24, 2023 at 11:46 AM Laura Morales  wrote:
>
> > > in the case that I want to use these URLs with a web browser.
> >
> > I don't understand what the trouble with the above example is?
>
> The problem with # is that browsers treat them as the start of a local 
> reference. When you open http://example.org/book#1 the server only receives 
> http://example.org/book. In other words it would be an error to create nodes 
> for n different books (#1 #2 #3 #n) if my goal is also to use these URLs with 
> a browser (for example if I want to show one page for every book). It's not a 
> problem with Jena, it's a problem with the way browsers treat the fragment.

If you want a page for every book, don't use fragment URIs. Use
http://example.org/book/1 or http://example.org/book/1#this instead of
 http://example.org/book#1.


Re: Querying URL with square brackets

2023-11-24 Thread Martynas Jusevičius
On Fri, Nov 24, 2023 at 10:31 AM Laura Morales  wrote:
>
> Thank you a lot. FILTER(STR(?id) = "...") works, as suggested by Andy. I do 
> recognize though that it is a hack, and that URLs should probably not have a 
> [.
>
> But now I have trouble understanding UTF8 addresses. I would use random 
> alphanumeric URLs everywhere if I could, or I would %-encode everything. But 
> nodes IDs (URLs) are supposed to be valid, human-readable URLs because 
> they're used online. Jena, and browsers, work fine with IRIs (which are 
> UTF8), but the way special characters are used is not the same. For example 
> it's perfectly fine in my graph to have a URL fragment, such as 
> http://example.org/foo#bar but these URLs are not usable with a browser 
> because the fragment is a local reference (local to the browser) that is not 
> sent to the server. Which means in practice, that if I want to stay out of 
> trouble I should not create a graph with IDs
>
> http://example.org/book#1
> http://example.org/book#2
> http://example.org/book#3
>
> in the case that I want to use these URLs with a web browser.

I don't understand what the trouble with the above example is?

> Viceversa, browsers are perfectly fine with a [ in the path, but Jena is 
> stricter.

It's not Jena that's stricter, it's the standard specifications. Or
you can say browsers are too lax. They use their own WHATWG URL
"specification".
Sometimes the URL you see in the address bar is not the actual URL
being sent to the server.

>
> So, if I want to use UTF8 addresses (IRIs) in my graph, and if I don't want 
> to %-encode them because I want them to be human-readable (also because they 
> are much easier to read/edit manually), what is the list of characters that 
> MUST be %-encoded?
>
>
> > Sent: Friday, November 24, 2023 at 9:55 AM
> > From: "Marco Neumann" 
> > To: users@jena.apache.org
> > Subject: Re: Querying URL with square brackets
> >
> > Laura, see jena issue #2102
> > https://github.com/apache/jena/issues/2102
> >
> > Marco


Re: Ever-increasing memory usage in Fuseki

2023-11-01 Thread Martynas Jusevičius
There were several long threads about this issue in the past months. I
think the consensus was it's Jetty-related, but I don't know if the issue
has been addressed.

https://lists.apache.org/thread/31ytzp3p2zg3gcsm86t1xlh4nsmdcfkc
https://lists.apache.org/thread/b64trj1c9n9rt0xjowqt4j23h9cy3v4c
https://lists.apache.org/thread/m7ypdsndjosxmdsxp9ch437305qw9mwd

On Wed, Nov 1, 2023 at 8:43 PM Hugo Mills 
wrote:

> Hi,
>
>
>
> We’ve got an application we’ve inherited recently which uses a Fuseki
> database. It was originally Fuseki 3.4.0, and has been upgraded to 4.9.0
> recently. The 3.4.0 server needed regular restarts (once a day) in order to
> keep working; the 4.9.0 server is even more unreliable, and has been
> running out of memory and being OOM-killed multiple times a day. This
> afternoon, it crashed enough times, fast enough, to make Kubernetes go into
> a back-off loop, and brought the app down for some time.
>
>
>
> We’re using OpenJDK 19. The JVM options are: “-Xmx30g -Xms18g”, and the
> container we’re running it in has a memory limit of 31 GiB. We tried the
> “-XX:+UseSerialGC” option this evening, but it didn’t seem to help much.
> We see the RAM usage of the java process rising steadily as queries are
> made, with occasional small, but insufficient, drops.
>
> The store is somewhere around 20M triples in size.
>
>
>
> Could anyone suggest any tweaks or options we could do to make this more
> stable, and not leak memory? We’ve downgraded to 3.4.0 again, and it’s not
> running out of space every few minutes at least, but it still has an
> ever-growing memory usage.
>
>
>
> Thanks,
>
> Hugo.
>
>
>
> *Dr. Hugo Mills*
>
> Senior Data Scientist
>
> hugo.mi...@agrimetrics.co.uk
>
>


Re: HTTP QueryExecution has been closed

2023-10-27 Thread Martynas Jusevičius
What I meant: can try-with-resources call qex.close() before
constructQuads even does the checkNotClosed() check?

On Fri, Oct 27, 2023 at 11:31 PM Martynas Jusevičius
 wrote:
>
> Can try-with-resources call qex.close() before constructDataset() does here?
> https://github.com/apache/jena/blob/main/jena-arq/src/main/java/org/apache/jena/sparql/exec/QueryExecDataset.java#L232
>
> On Fri, Oct 27, 2023 at 11:18 PM Martynas Jusevičius
>  wrote:
> >
> > Hi,
> >
> > I'm trying to understand in which circumstances the following code
> >
> > try (QueryExecution qex = QueryExecution.create(getQuery(), rowModel))
> > {
> > return qex.execConstructDataset();
> > }
> >
> > can throw the "HTTP QueryExecution has been closed" exception?
> > Full code here:
> > https://github.com/AtomGraph/LinkedDataHub/blob/rf-direct-graph-ids-only/src/main/java/com/atomgraph/linkeddatahub/imports/stream/csv/CSVGraphStoreRowProcessor.java#L141
> >
> > The execution is not even happening over HTTP? Is it somehow closed 
> > prematurely?
> >
> > I can see the exception being thrown in QueryExecDataset::constructQuads:
> > https://github.com/apache/jena/blob/main/jena-arq/src/main/java/org/apache/jena/sparql/exec/QueryExecDataset.java#L211
> >
> > Martynas


Re: HTTP QueryExecution has been closed

2023-10-27 Thread Martynas Jusevičius
Can try-with-resources call qex.close() before constructDataset() does here?
https://github.com/apache/jena/blob/main/jena-arq/src/main/java/org/apache/jena/sparql/exec/QueryExecDataset.java#L232

On Fri, Oct 27, 2023 at 11:18 PM Martynas Jusevičius
 wrote:
>
> Hi,
>
> I'm trying to understand in which circumstances the following code
>
> try (QueryExecution qex = QueryExecution.create(getQuery(), rowModel))
> {
> return qex.execConstructDataset();
> }
>
> can throw the "HTTP QueryExecution has been closed" exception?
> Full code here:
> https://github.com/AtomGraph/LinkedDataHub/blob/rf-direct-graph-ids-only/src/main/java/com/atomgraph/linkeddatahub/imports/stream/csv/CSVGraphStoreRowProcessor.java#L141
>
> The execution is not even happening over HTTP? Is it somehow closed 
> prematurely?
>
> I can see the exception being thrown in QueryExecDataset::constructQuads:
> https://github.com/apache/jena/blob/main/jena-arq/src/main/java/org/apache/jena/sparql/exec/QueryExecDataset.java#L211
>
> Martynas


HTTP QueryExecution has been closed

2023-10-27 Thread Martynas Jusevičius
Hi,

I'm trying to understand in which circumstances the following code

try (QueryExecution qex = QueryExecution.create(getQuery(), rowModel))
{
return qex.execConstructDataset();
}

can throw the "HTTP QueryExecution has been closed" exception?
Full code here:
https://github.com/AtomGraph/LinkedDataHub/blob/rf-direct-graph-ids-only/src/main/java/com/atomgraph/linkeddatahub/imports/stream/csv/CSVGraphStoreRowProcessor.java#L141

The execution is not even happening over HTTP? Is it somehow closed prematurely?

I can see the exception being thrown in QueryExecDataset::constructQuads:
https://github.com/apache/jena/blob/main/jena-arq/src/main/java/org/apache/jena/sparql/exec/QueryExecDataset.java#L211
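
Not necessarily the cause here, but for comparison, the usual way to hit this
error is letting a lazy result escape the try-with-resources block. A sketch
of the safe SELECT pattern ('query' and 'model' are placeholders); as I
understand it, execConstructDataset() materializes its Dataset before close,
so returning it as above should be fine:

    import org.apache.jena.query.*;

    ResultSet safe;
    try (QueryExecution qe = QueryExecutionFactory.create(query, model)) {
        // copy the lazy ResultSet before close() invalidates it
        safe = ResultSetFactory.copyResults(qe.execSelect());
    }
    // 'safe' remains usable after the execution is closed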

Martynas


Re: How to reconstruct a Literal from a SPARQL SELECT row element?

2023-10-26 Thread Martynas Jusevičius
OK, the documentation is not exhaustive... The 1-argument
createTypedLiteral() attempts to infer the RDF datatype from the Java
type; 'objectDataValue' is a String, I'm guessing, which becomes
xsd:string.
If you want to override this with an explicit datatype URI, you need
the 2-argument version:
https://jena.apache.org/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/rdf/model/Model.html#createTypedLiteral(java.lang.String,org.apache.jena.datatypes.RDFDatatype)
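
A minimal illustration of the difference (sketch):

    import org.apache.jena.datatypes.xsd.XSDDatatype;
    import org.apache.jena.rdf.model.Literal;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;

    Model m = ModelFactory.createDefaultModel();
    Literal inferred = m.createTypedLiteral("123.456"); // datatype inferred from java.lang.String
    Literal typed = m.createTypedLiteral("123.456", XSDDatatype.XSDfloat);
    System.out.println(inferred.getDatatypeURI()); // ...XMLSchema#string
    System.out.println(typed.getDatatypeURI());    // ...XMLSchema#float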

On Thu, Oct 26, 2023 at 12:55 PM Steve Vestal  wrote:
>
> Literal dataLiteral = resultGraph.createTypedLiteral(objectDataValue);
> System.err.println("objectLiteral: " + objectDataValue + " " +
> dataLiteral.getDatatypeURI());
>
> always says type is http://www.w3.org/2001/XMLSchema#string
>
>
> On 10/26/2023 5:26 AM, Martynas Jusevičius wrote:
> > You need Model::createTypedLiteral
> > https://jena.apache.org/documentation/notes/typed-literals.html#basic-api-operations
> >
> > On Thu, 26 Oct 2023 at 12.24, Steve Vestal  wrote:
> >
> >> If I reconstruct using
> >>
> >>Literal dataLiteral = resultGraph.createLiteral(objectDataValue);
> >>
> >> it always says the type is string
> >>
> >>   1^^xsd:string
> >>   stringB^^ xsd:string
> >>   123.456^^xsd:string
> >>   2023-10-06T12:05:10Z^^xsd:string
> >>
> >> On 10/26/2023 4:17 AM, Steve Vestal wrote:
> >>> What is the best way to reconstruct a typed Literal from a SPARQL
> >>> SELECT result?
> >>>
> >>> I have a SPARQL SELECT query issued against an OntModel in this way:
> >>>
> >>>   QueryExecution structureRowsExec =
> >>> QueryExecutionFactory.create(structureRowsQuery, owlOntModel);
> >>>
> >>> Here are some example triples in the query:
> >>>
> >>>?a2
> >>> <
> >> http://www.galois.com/indigo/test/structure_datatypes_test#floatProperty>
> >>> ?dataVar1.
> >>>?a2
> >>> <
> >> http://www.galois.com/indigo/test/structure_datatypes_test#dateTimeProperty>
> >>
> >>> ?dataVar2.
> >>>
> >>> The OntModel being queried was created using typed literals, e.g.,
> >>>
> >>>
> >>>  DataPropertyAssertion( struct:floatProperty struct:indivA2
> >>> "123.456"^^xsd:float )
> >>>  DataPropertyAssertion( struct:dateTimeProperty struct:indivA2
> >>> "2023-10-06T12:05:10Z"^^xsd:dateTime )
> >>>
> >>> When I look at the ?dataVar1 and ?dataVar2 results in a row, I get
> >>> things like:
> >>>
> >>>   1
> >>>   stringB
> >>>   123.456
> >>>   2023-10-06T12:05:10Z
> >>>
> >>> What is a good way to reconstruct a typed Literal from the query
> >>> results? Is there a SPARQL option to show full typed literal strings?
> >>> Something that can be added to the query?  A utility method that can
> >>> identify the XSD schema simple data type when given a result value
> >>> string?
> >>>
> >>>


Re: How to reconstruct a Literal from a SPARQL SELECT row element?

2023-10-26 Thread Martynas Jusevičius
You need Model::createTypedLiteral
https://jena.apache.org/documentation/notes/typed-literals.html#basic-api-operations

On Thu, 26 Oct 2023 at 12.24, Steve Vestal  wrote:

> If I reconstruct using
>
>   Literal dataLiteral = resultGraph.createLiteral(objectDataValue);
>
> it always says the type is string
>
>  1^^xsd:string
>  stringB^^ xsd:string
>  123.456^^xsd:string
>  2023-10-06T12:05:10Z^^xsd:string
>
> On 10/26/2023 4:17 AM, Steve Vestal wrote:
> > What is the best way to reconstruct a typed Literal from a SPARQL
> > SELECT result?
> >
> > I have a SPARQL SELECT query issued against an OntModel in this way:
> >
> >  QueryExecution structureRowsExec =
> > QueryExecutionFactory.create(structureRowsQuery, owlOntModel);
> >
> > Here are some example triples in the query:
> >
> >   ?a2
> > <
> http://www.galois.com/indigo/test/structure_datatypes_test#floatProperty>
> > ?dataVar1.
> >   ?a2
> > <
> http://www.galois.com/indigo/test/structure_datatypes_test#dateTimeProperty>
>
> > ?dataVar2.
> >
> > The OntModel being queried was created using typed literals, e.g.,
> >
> >
> > DataPropertyAssertion( struct:floatProperty struct:indivA2
> > "123.456"^^xsd:float )
> > DataPropertyAssertion( struct:dateTimeProperty struct:indivA2
> > "2023-10-06T12:05:10Z"^^xsd:dateTime )
> >
> > When I look at the ?dataVar1 and ?dataVar2 results in a row, I get
> > things like:
> >
> >  1
> >  stringB
> >  123.456
> >  2023-10-06T12:05:10Z
> >
> > What is a good way to reconstruct a typed Literal from the query
> > results? Is there a SPARQL option to show full typed literal strings?
> > Something that can be added to the query?  A utility method that can
> > identify the XSD schema simple data type when given a result value
> > string?
> >
> >
>


Re: How to reconstruct a Literal from a SPARQL SELECT row element?

2023-10-26 Thread Martynas Jusevičius
I think DATATYPE() is what you are looking for:
https://kgdev.net/specifications/sparql11-query/expressions/SparqlOps/func-rdfTerms/#func-datatype
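
For example, something along these lines should print each value with its
datatype (a sketch; 'model' stands in for your OntModel):

    import org.apache.jena.query.*;

    String qs = "SELECT ?v (DATATYPE(?v) AS ?dt) "
              + "WHERE { ?s ?p ?v . FILTER(isLiteral(?v)) }";
    try (QueryExecution qe = QueryExecutionFactory.create(QueryFactory.create(qs), model)) {
        ResultSet rs = qe.execSelect();
        while (rs.hasNext()) {
            QuerySolution row = rs.next();
            System.out.println(row.get("v") + " ^^ " + row.get("dt"));
        }
    }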


On Thu, 26 Oct 2023 at 11.17, Steve Vestal  wrote:

> What is the best way to reconstruct a typed Literal from a SPARQL SELECT
> result?
>
> I have a SPARQL SELECT query issued against an OntModel in this way:
>
>   QueryExecution structureRowsExec =
> QueryExecutionFactory.create(structureRowsQuery, owlOntModel);
>
> Here are some example triples in the query:
>
>?a2
> 
>
> ?dataVar1.
>?a2
> <
> http://www.galois.com/indigo/test/structure_datatypes_test#dateTimeProperty>
>
> ?dataVar2.
>
> The OntModel being queried was created using typed literals, e.g.,
>
>
>  DataPropertyAssertion( struct:floatProperty struct:indivA2
> "123.456"^^xsd:float )
>  DataPropertyAssertion( struct:dateTimeProperty struct:indivA2
> "2023-10-06T12:05:10Z"^^xsd:dateTime )
>
> When I look at the ?dataVar1 and ?dataVar2 results in a row, I get
> things like:
>
>   1
>   stringB
>   123.456
>   2023-10-06T12:05:10Z
>
> What is a good way to reconstruct a typed Literal from the query
> results? Is there a SPARQL option to show full typed literal strings?
> Something that can be added to the query?  A utility method that can
> identify the XSD schema simple data type when given a result value string?
>
>
>


Re: General RDFDatatype factory?

2023-10-24 Thread Martynas Jusevičius
Maybe this?
https://jena.apache.org/documentation/notes/typed-literals.html#user-defined-non-xsd-data-types
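
And if the goal is simply to look up an RDFDatatype by its URI, I believe
TypeMapper does that (sketch):

    import org.apache.jena.datatypes.RDFDatatype;
    import org.apache.jena.datatypes.TypeMapper;

    RDFDatatype dt = TypeMapper.getInstance()
            .getSafeTypeByName("http://www.w3.org/2001/XMLSchema#integer");
    System.out.println(dt.getURI()); // http://www.w3.org/2001/XMLSchema#integer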


On Tue, 24 Oct 2023 at 19.17, Steve Vestal  wrote:

> https://jena.apache.org/documentation/notes/typed-literals.html#xsd says
> "These are all available as static member variables
> from|org.apache.jena.datatypes.xsd.XSDDatatype|
> <
> https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/datatypes/xsd/XSDDatatype.html>"
>
> but that link is Not Found.
>
> On 10/24/2023 11:38 AM, Steve Vestal wrote:
> >
> > Is there a factory to create an RDFDatatype using a URL, e.g.,
> > http://www.w3.org/2001/XMLSchema#integer ?
> >
> > The closest I have found so far is javax.xml.datatype.DatatypeFactory,
> > but that only supports a specific list.
> >
> >


Re: General RDFDatatype factory?

2023-10-24 Thread Martynas Jusevičius
Ah sorry Steve, I think I misread your question :)

On Tue, 24 Oct 2023 at 19.07, Martynas Jusevičius 
wrote:

>
> https://jena.apache.org/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/rdf/model/ResourceFactory.html#createTypedLiteral(java.lang.String,org.apache.jena.datatypes.RDFDatatype)
>
>
> On Tue, 24 Oct 2023 at 18.38, Steve Vestal 
> wrote:
>
>> Is there a factory to create an RDFDatatype using a URL, e.g.,
>> http://www.w3.org/2001/XMLSchema#integer ?
>>
>> The closest I have found so far is javax.xml.datatype.DatatypeFactory,
>> but that only supports a specific list.
>>
>>


Re: General RDFDatatype factory?

2023-10-24 Thread Martynas Jusevičius
https://jena.apache.org/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/rdf/model/ResourceFactory.html#createTypedLiteral(java.lang.String,org.apache.jena.datatypes.RDFDatatype)


On Tue, 24 Oct 2023 at 18.38, Steve Vestal  wrote:

> Is there a factory to create an RDFDatatype using a URL, e.g.,
> http://www.w3.org/2001/XMLSchema#integer ?
>
> The closest I have found so far is javax.xml.datatype.DatatypeFactory,
> but that only supports a specific list.
>
>


Re: read-only Fuseki TDB2

2023-09-18 Thread Martynas Jusevičius
This looks like the configuration that you need:
https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html#read-only-service
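
If you embed Fuseki instead (fuseki-main), a programmatic sketch of the same
idea; note that, as far as I know, TDB2 still creates its lock file on
connect, so the database directory itself may need to be writable even for a
query-only service:

    import org.apache.jena.fuseki.main.FusekiServer;
    import org.apache.jena.query.Dataset;
    import org.apache.jena.tdb2.TDB2Factory;

    Dataset ds = TDB2Factory.connectDataset("/path/to/tdb2");
    FusekiServer.create()
            .port(3030)
            .add("/ds", ds, false) // false: query endpoints only, no update
            .build()
            .start();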

On Mon, Sep 18, 2023 at 2:43 PM Jim Balhoff  wrote:
>
> Hi,
>
> Is it possible to run a Fuseki server using a read-only TDB2 directory? I’d 
> like to run a query-only SPARQL endpoint, no updates. However I get an 
> exception at startup if the filesystem is read-only. Does Fuseki need to 
> acquire the lock even if updates are turned off?
>
> Thank you,
> Jim
>


Re: Mystery memory leak in fuseki

2023-08-31 Thread Martynas Jusevičius
Does Fuseki have a direct code dependency on Jetty? Or would it be possible
to try switching to a different servlet container such as Tomcat?

JAX-RS, which I’ve advocated here multiple times, provides such a
higher-level abstraction above servlets that would enable easy switching.

On Fri, 25 Aug 2023 at 16.18, Dave Reynolds 
wrote:

> On 25/08/2023 11:44, Andy Seaborne wrote:
> >
> >
> > On 03/07/2023 14:20, Dave Reynolds wrote:
> >> We have a very strange problem with recent fuseki versions when
> >> running (in docker containers) on small machines. Suspect a jetty
> >> issue but it's not clear.
> >
> >  From the threads here, it does seem to be Jetty related.
>
> Yes.
>
> We've followed up on Rob's suggestions for tuning the jetty settings so
> we can use a stock fuseki. On 4.9.0, if we switch off direct buffer use
> in jetty altogether, the problem does seem to go away. The performance
> hit we see is small and barely above noise.
>
> We currently have a soak test of leaving direct buffers on but limiting
> max and retained levels, that looks promising but too early to be sure.
>
> > I haven't managed to reproduce the situation on my machine in any sort
> > of predictable way where I can look at what's going on.
>
> Understood. While we can reproduce some effects in desktop test set ups
> the only real test has been to leave configurations running for days at
> a time in the real dev setting with all it's monitoring and
> instrumentation. Which makes testing any changes very painful, let alone
> deeper investigations.
>
> > For Jena5, there will be a switch to a Jetty that uses jakarta.*
> > packages. That's no more than a rename of imports. The migration
> > EE8->EE9 is only repackaging.  That's Jetty10->Jetty11.
> >
> > There is now Jetty12. It is a major re-architecture of Jetty including
> > its network handling for better HTTP/2 and HTTP/3.
> >
> > If there has been some behaviour of Jetty involved in the memory growth,
> > it is quite unlikely to be carried over to Jetty12.
> >
> > Jetty12 is not a simple switch of artifacts for Fuseki. APIs have
> > changed but it's a step that going to be needed sometime.
> >
> > If it does not turn out that Fuseki needs a major re-architecture, I
> > think that Jena5 should be based on Jetty12. So far, it looks doable.
>
> Sounds promising. Agreed that jetty12 is enough of a new build that it's
> unlikely to have the same behaviour.
>
> We've been testing some of our troublesome queries on 4.9.0 on java 11
> vs java 17 and see a 10-15% performance hit on java 17 (even after we
> take control of the GC by forcing both to use the old parallel GC
> instead of G1). No idea why, seems wrong! Makes us inclined to stick
> with java 11 and thus jena 4.x series as long as we can.
>
> Dave
>
>


Re: Commercial Fuseki operational support

2023-08-25 Thread Martynas Jusevičius
Nicholas,

Fuseki is present on the AWS marketplace, if you follow your own link :)

The product is built using CDK, CloudFormation and Docker. If you’d be
interested to collaborate on that, send me an email off-list.

Martynas
atomgraph.com

On Fri, 25 Aug 2023 at 10.52, Nicholas Car  wrote:

> Hi Jena users,
>
> Just so everyone is aware: I have had several companies contact me about
> various levels of commercial Fuseki support.
>
> For users' interest, these are my observations of their Fuseki offerings:
>
> - many companies use Fuseki commercially as part of holistic Knowledge
> Graph architectures
> - no companies offer SLA-backed Fuseki support as a distinct service right
> now
>
> - other than mine! and this offer is new with SLAs still being worked out (
> https://kurrawong.ai/supported-products/fuseki/) so it is not very mature
> - several companies that use Fuseki are able to, and are interested in,
> offering distinct support if a serious request appears
> - lots of companies maintain Fuseki in Docker and similar container images
>
> - just search for Fuseki on Docker Hub and similar:
>
> - https://hub.docker.com/search?q=fuseki
> - https://github.com/search?q=fuseki&type=registrypackages
> - Fuseki is not present, in standalone form, in the AWS or Azure marketplaces
> yet:
>
> - https://aws.amazon.com/marketplace/search/results?searchTerms=fuseki
> -
> https://azuremarketplace.microsoft.com/en-gb/marketplace/apps?search=fuseki&page=1
>
> If anything here is incorrect, please reply back to the mailing list to
> point out errors.
>
> Cheers, Nick
>
> --- Original Message ---
> On Wednesday, August 23rd, 2023 at 10:55, Nicholas Car 
> wrote:
>
> > Dear Jena users,
> >
> > Do any of you know companies that offer commercial support with Service
> Level Agreements for Fuseki?
> >
> > I have a government department here in Australia that wishes to run
> operational Fuseki instances with patching/config support for strong
> operations.
> >
> > Thanks, Nick


Re: Re: Mystery memory leak in fuseki

2023-07-21 Thread Martynas Jusevičius
There is one more variable in this picture: Java’s container awareness
https://developers.redhat.com/articles/2022/04/19/java-17-whats-new-openjdks-container-awareness

Whether it has an impact in this case, I have no idea :)

On Thu, 20 Jul 2023 at 11.48, Conal McLaughlin 
wrote:

> Hey Andy,
>
> Metaspace seems to be stable!
>
> We’re running this on Java 11 currently.
> I can check it out with Java17 though.
>
> We’ve currently set Xms/Xmx to 2560MB & MaxMetaspaceSize to 256MB.
>
> The ECS task is set with a ceiling of 4GB Memory & 1vcpu.
>
> Could it be more of a race condition than size of used objects, due to
> logging?
> I do see some time sensitive eviction code in Jetty -
> https://github.com/eclipse/jetty.project/blob/9e16d81cf8922c75e3d2d96c66442b896a9c69e1/jetty-io/src/main/java/org/eclipse/jetty/io/ArrayRetainableByteBufferPool.java#L374
> Not sure if the same type of thing exists in the Jena codebase.
>
> I will try to check with `—empty` also.
>
>
> Thanks,
> Conal
>
> On 2023/07/19 21:10:24 Andy Seaborne wrote:
> > Conal,
> >
> > Thanks for the information.
> > Can you see if metaspace is growing as well?
> >
> > All,
> >
> > Could someone please try running Fuseki main, with no datasets (--empty)
> > and some healthcheck ping traffic.
> >
> >  Andy
> >
> > On 19/07/2023 14:42, Conal McLaughlin wrote:
> > > Hey Dave,
> > >
> > > Thank you for providing an in depth analysis of your issues.
> > > We appear to be witnessing the same type of problems with our current
> > > Fuseki deployment.
> > > We are deploying a containerised Fuseki into an AWS ECS task alongside
> > > other containers - this may not be ideal but that’s a different story.
> > >
> > > I just wanted to add another data point to everything you have
> described.
> > > Firstly, it does seem like “idle” (or very low traffic) instances are
> > > the problem, for us (coupled with a larger heap than necessary).
> > > We witness the same increase in the ECS task memory consumption up
> until
> > > the whole thing is killed off. Which includes the Fuseki container.
> > >
> > > In an attempt to see what was going on beneath the hood, we turned up
> > > the logging to TRACE in the log4j2.xml file provided to Fuseki.
> > > This appeared to stabilise the increasing memory consumption.
> > > Even just switching the `logger.jetty.level` to TRACE alleviates the
> issue.
> >
> > Colour me confused!
> >
> > A Log4j logger that is active will use a few objects - maybe that's enough
> > to trigger a minor GC which in turn is enough to flush some non-heap
> > resources.
> >
> > How big is the heap?
> > This is Java17?
> >
> > > We are testing this on Fuseki 4.8.0/TDB2 with close to 0 triples and
> > > extremely low query traffic / health checks via /ping.
> > > [image attachment: ecs-task-memory graph, hosted on pasteboard.co]
> > >
> > >
> > > Cheers,
> > > Conal
> > >
> > > On 2023/07/11 09:31:25 Dave Reynolds wrote:
> > >  > Hi Rob,
> > >  >
> > >  > Good point. Will try to find time to experiment with that but given
> the
> > >  > testing cycle time that will take a while and can't start
> immediately.
> > >  >
> > >  > I'm a little sceptical though. As mentioned before, all the metrics
> we
> > >  > see show the direct memory pool that Jetty uses cycling up the max
> heap
> > >  > size and then being collected but with no long term growth to match
> the
> > >  > process size growth. This really feels more like a bug (though not
> sure
> > >  > where) than tuning. The fact that actual behaviour doesn't match the
> > >  > documentation isn't encouraging.
> > >  >
> > >  > It's also pretty hard to figure what the right pool configuration
> would
> > >  > be. This thing is just being asked to deliver a few metrics (12KB
> per
> > >  > request) several times a minute but manages to eat 500MB of direct
> > >  > buffer space every 5mins. So what the right pool parameters are to
> > >  > support real usage peaks is not going to be easy to figure out.
> > >  >
> > >  > None the less you are right. That's something that should be
> explored.
> > >  >
> > >  > Dave
> > >  >
> > >  >
> > >  > On 11/07/2023 09:45, Rob @ DNR wrote:
> > >  > > Dave
> > >  > >
> > >  > > Thanks for the further information.
> > >  > >
> > >  > > Have you experimented with using Jetty 10 but providing more
> > > detailed configuration? Fuseki supports providing detailed Jetty
> > > configuration if needed via the --jetty-config option
> > >  > >
> > >  > > The following section look relevant:
> > >  > >
> > >  > >
> > >
> https://eclipse.dev/jetty/documentation/jetty-10/operations-guide/index.html#og-module-bytebufferpool
> > >  > >
> > >  > > It looks like the default is that Jetty uses a heuristic to
> > > determine these values, sadly the heuristic in question is not
> detailed
> > > in that documentation.
> > >  > >
> 

Re: Dataset management API

2023-07-13 Thread Martynas Jusevičius
I realised the dataset management API is only available in
fuseki-webapp and not fuseki-main. That's unfortunate.
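
For the webapp, a minimal sketch of creating a TDB2 dataset over the
management protocol (using the dbName/dbType parameters Andy mentions below;
java.net.http is just for illustration):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    static void createDataset() throws Exception {
        HttpRequest req = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:3030/$/datasets?dbName=ds&dbType=tdb2"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<String> resp = HttpClient.newHttpClient()
                .send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode()); // expect 200 on success
    }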

On Thu, Jul 13, 2023 at 10:09 PM Martynas Jusevičius
 wrote:
>
> Andy,
>
> Where are the dataset definitions created through the API persisted?
> Are they merged with the datasets defined in the config file, or how does it 
> work?
>
> Martynas
>
> On Sun, 2 Jul 2023 at 19.03, Andy Seaborne  wrote:
>>
>>
>>
>> On 02/07/2023 13:23, Martynas Jusevičius wrote:
>> > Hi,
>> >
>> > Can I see an example of the data that needs to be POSTed to /$/datasets in
>> > order to create a new dataset+service?
>> >
>> > The API is documented here but the data example is missing:
>> > https://jena.apache.org/documentation/fuseki2/fuseki-server-protocol.html#adding-a-dataset-and-its-services
>> >
>> > I hope it’s the same data that is used in the config file?
>>
>> the service part - or parameters dbType and dbname
>>
>> ActionDatasets.java
>>
>> > https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html#defining-the-service-name-and-endpoints-available
>> >
>> > Are there any practical limits on the number of datasets/services?
>>
>> No.
>>
>> Each database takes up some memory which is more than the management
>> information of a configuration.
>>
>>  Andy
>>
>> >
>> > Thanks.
>> >
>> > Martynas
>> >


Re: Dataset management API

2023-07-13 Thread Martynas Jusevičius
Andy,

Where are the dataset definitions created through the API persisted?
Are they merged with the datasets defined in the config file, or how
does it work?

Martynas

On Sun, 2 Jul 2023 at 19.03, Andy Seaborne  wrote:

>
>
> On 02/07/2023 13:23, Martynas Jusevičius wrote:
> > Hi,
> >
> > Can I see an example of the data that needs to be POSTed to /$/datasets
> in
> > order to create a new dataset+service?
> >
> > The API is documented here but the data example is missing:
> >
> https://jena.apache.org/documentation/fuseki2/fuseki-server-protocol.html#adding-a-dataset-and-its-services
> >
> > I hope it’s the same data that is used in the config file?
>
> the service part - or parameters dbType and dbname
>
> ActionDatasets.java
>
> >
> https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html#defining-the-service-name-and-endpoints-available
> >
> > Are there any practical limits on the number of datasets/services?
>
> No.
>
> Each database takes up some memory which is more than the management
> information of a configuration.
>
>  Andy
>
> >
> > Thanks.
> >
> > Martynas
> >
>


Re: Mystery memory leak in fuseki

2023-07-04 Thread Martynas Jusevičius
You can profile it in the container as well :)
https://github.com/AtomGraph/fuseki-docker#profiling

On Tue, 4 Jul 2023 at 11.12, Rob @ DNR  wrote:

> Does this only happen in a container?  Or can you reproduce it running
> locally as well?
>
> If you can reproduce it locally then attaching a profiler like VisualVM so
> you can take a heap snapshot and see where the memory is going that would
> be useful
>
> Rob
>
> From: Dave Reynolds 
> Date: Tuesday, 4 July 2023 at 09:31
> To: users@jena.apache.org 
> Subject: Re: Mystery memory leak in fuseki
> Tried 4.7.0 under the most up-to-date Java 17 and it acts like 4.8.0. After
> 16 hours it gets to about 1.6GB and by eye has nearly flattened off
> somewhat, but not completely.
>
> For interest here's a MEM% curve on a 4GB box (hope the link works).
>
> https://www.dropbox.com/s/xjmluk4o3wlwo0y/fuseki-mem-percent.png?dl=0
>
> The flattish curve from 12:00 to 17:20 is a run using 3.16.0 for
> comparison. The curve from then onwards is 4.7.0.
>
> The spikes on the 4.7.0 match the allocation and recovery of the direct
> memory buffers. The JVM metrics show those cycling around every 10mins
> and being reclaimed each time with no leaking visible at that level.
> Heap, non-heap and mapped buffers are all basically unchanging which is
> to be expected since it's doing nothing apart from reporting metrics.
>
> Whereas this curve (again from 17:20 onwards) shows basically the same
> 4.7.0 set up on a separate host, showing that despite flattening out
> somewhat usage continues to grow - a least on a 16 hour timescale.
>
> https://www.dropbox.com/s/k0v54yq4kexklk0/fuseki-mem-percent-2.png?dl=0
>
>
> Both of those runs were using Eclipse Temurin on a base Ubuntu jammy
> container. Pervious runs used AWS Corretto on an AL2 base container.
> Behaviour basically unchanged so eliminates this being some
> Corretto-specific issue or a weird base container OS issue.
>
> Dave
>
> On 03/07/2023 14:54, Andy Seaborne wrote:
> > Hi Dave,
> >
> > Could you try 4.7.0?
> >
> > 4.6.0 was 2022-08-20
> > 4.7.0 was 2022-12-27
> > 4.8.0 was 2023-04-20
> >
> > This is an in-memory database?
> >
> > Micrometer/Prometheus has had several upgrades but if it is not heap and
> > not direct memory (I thought that was a hard bound set at start up), I
> > don't see how it can be involved.
> >
> >  Andy
> >
> > On 03/07/2023 14:20, Dave Reynolds wrote:
> >> We have a very strange problem with recent fuseki versions when
> >> running (in docker containers) on small machines. Suspect a jetty
> >> issue but it's not clear.
> >>
> >> Wondering if anyone has seen anything like this.
> >>
> >> This is a production service but with tiny data (~250k triples, ~60MB
> >> as NQuads). Runs on 4GB machines with java heap allocation of 500MB[1].
> >>
> >> We used to run using 3.16 on jdk 8 (AWS Corretto for the long term
> >> support) with no problems.
> >>
> >> Switching to fuseki 4.8.0 on jdk 11 the process grows in the space of
> >> a day or so to reach ~3GB of memory at which point the 4GB machine
> >> becomes unviable and things get OOM killed.
> >>
> >> The strange thing is that this growth happens when the system is
> >> answering no Sparql queries at all, just regular health ping checks
> >> and (prometheus) metrics scrapes from the monitoring systems.
> >>
> >> Furthermore the space being consumed is not visible to any of the JVM
> >> metrics:
> >> - Heap and non-heap are stable at around 100MB total (mostly
> >> non-heap metaspace).
> >> - Mapped buffers stay at 50MB and remain long term stable.
> >> - Direct memory buffers being allocated up to around 500MB then being
> >> reclaimed. Since there are no sparql queries at all we assume this is
> >> jetty NIO buffers being churned as a result of the metric scrapes.
> >> However, this direct buffer behaviour seems stable, it cycles between
> >> 0 and 500MB on approx a 10min cycle but is stable over a period of
> >> days and shows no leaks.
> >>
> >> Yet the java process grows from an initial 100MB to at least 3GB. This
> >> can occur in the space of a couple of hours or can take up to a day or
> >> two with no predictability in how fast.
> >>
> >> Presumably there is some low level JNI space allocated by Jetty (?)
> >> which is invisible to all the JVM metrics and is not being reliably
> >> reclaimed.
> >>
> >> Trying 4.6.0, which we've had less problems with elsewhere, that seems
> >> to grow to around 1GB (plus up to 0.5GB for the cycling direct memory
> >> buffers) and then stays stable (at least on a three day soak test).
> >> We could live with allocating 1.5GB to a system that should only need
> >> a few 100MB but concerned that it may not be stable in the really long
> >> term and, in any case, would rather be able to update to more recent
> >> fuseki versions.
> >>
> >> Trying 4.8.0 on java 17 it grows rapidly to around 1GB again but then
> >> keeps ticking up slowly at random intervals. We project that it would
> >> take a few weeks to grow the 

Re: Mystery memory leak in fuseki

2023-07-03 Thread Martynas Jusevičius
There have been a few similar threads:

https://www.mail-archive.com/users@jena.apache.org/msg19871.html

https://www.mail-archive.com/users@jena.apache.org/msg18825.html

On Mon, 3 Jul 2023 at 15.20, Dave Reynolds 
wrote:

> We have a very strange problem with recent fuseki versions when running
> (in docker containers) on small machines. Suspect a jetty issue but it's
> not clear.
>
> Wondering if anyone has seen anything like this.
>
> This is a production service but with tiny data (~250k triples, ~60MB as
> NQuads). Runs on 4GB machines with java heap allocation of 500MB[1].
>
> We used to run using 3.16 on jdk 8 (AWS Corretto for the long term
> support) with no problems.
>
> Switching to fuseki 4.8.0 on jdk 11 the process grows in the space of a
> day or so to reach ~3GB of memory at which point the 4GB machine becomes
> unviable and things get OOM killed.
>
> The strange thing is that this growth happens when the system is
> answering no Sparql queries at all, just regular health ping checks and
> (prometheus) metrics scrapes from the monitoring systems.
>
> Furthermore the space being consumed is not visible to any of the JVM
> metrics:
> - Heap and non-heap are stable at around 100MB total (mostly
> non-heap metaspace).
> - Mapped buffers stay at 50MB and remain long term stable.
> - Direct memory buffers being allocated up to around 500MB then being
> reclaimed. Since there are no sparql queries at all we assume this is
> jetty NIO buffers being churned as a result of the metric scrapes.
> However, this direct buffer behaviour seems stable, it cycles between 0
> and 500MB on approx a 10min cycle but is stable over a period of days
> and shows no leaks.
>
> Yet the java process grows from an initial 100MB to at least 3GB. This
> can occur in the space of a couple of hours or can take up to a day or
> two with no predictability in how fast.
>
> Presumably there is some low level JNI space allocated by Jetty (?)
> which is invisible to all the JVM metrics and is not being reliably
> reclaimed.
>
> Trying 4.6.0, which we've had less problems with elsewhere, that seems
> to grow to around 1GB (plus up to 0.5GB for the cycling direct memory
> buffers) and then stays stable (at least on a three day soak test).  We
> could live with allocating 1.5GB to a system that should only need a few
> 100MB but concerned that it may not be stable in the really long term
> and, in any case, would rather be able to update to more recent fuseki
> versions.
>
> Trying 4.8.0 on java 17 it grows rapidly to around 1GB again but then
> keeps ticking up slowly at random intervals. We project that it would
> take a few weeks to grow to the scale it did under java 11 but it will
> still eventually kill the machine.
>
> Anyone seen anything remotely like this?
>
> Dave
>
> [1]  500M heap may be overkill but there can be some complex queries and
> that should still leave plenty of space for OS buffers etc in the
> remaining memory on a 4GB machine.
>
>
>
>


Dataset management API

2023-07-02 Thread Martynas Jusevičius
Hi,

Can I see an example of the data that needs to be POSTed to /$/datasets in
order to create a new dataset+service?

The API is documented here but the data example is missing:
https://jena.apache.org/documentation/fuseki2/fuseki-server-protocol.html#adding-a-dataset-and-its-services

I hope it’s the same data that is used in the config file?
https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html#defining-the-service-name-and-endpoints-available

Are there any practical limits on the number of datasets/services?

Thanks.

Martynas


Re: CONSTRUCT query with rsparql

2023-06-14 Thread Martynas Jusevičius
Sorry, I meant this:



CONSTRUCT
  {
?s ?p ?o .
  }
WHERE
  { ?s  ?p  ?o ;
<http://purl.org/dc/terms/title>  "the message"
  }

On Wed, Jun 14, 2023 at 12:23 PM Martynas Jusevičius
 wrote:
>
> No syntax issue, but with ?o not bound, the ?s ?p ?o triple will
> not be constructed.
>
> This might or might not be what you are looking for:
>
> CONSTRUCT
>   {
> ?s ?p ?o .
>   }
> WHERE
>   { ?s  <http://purl.org/dc/terms/title>  ?o ;
> <http://purl.org/dc/terms/title>  "the message"
>   }
>
> On Wed, Jun 14, 2023 at 12:13 PM Hashim Khan  
> wrote:
> >
> > Thanks for the quick response. Could you write the exact query,
> > please? By "?o not bound", do you mean that we cannot execute such a
> > construct query? Or is there a syntax issue?
> >
> > On Wed, Jun 14, 2023 at 12:04 PM Martynas Jusevičius 
> > 
> > wrote:
> >
> > > ?o is not bound in your WHERE pattern.
> > >
> > > On Wed, Jun 14, 2023 at 11:50 AM Hashim Khan 
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > ./rsparql --service http://localhost:8890/sparql --query query.sparql
> > > --results=ntriples I want to save the triples in NT format as returned by
> > > the construct query given here.
> > > > CONSTRUCT {?s ?p ?o} WHERE {
> > > >?s <http://purl.org/dc/terms/title> 'the message' .
> > > > }
> > > >
> > > I have run this query with SELECT and it returns many triples, but when I
> > > want to CONSTRUCT them I cannot see the effect. It takes about two
> > > seconds, but no output.nt file is generated, nor do I see any triples.
> > > Could anyone please tell me the exact query to accomplish this job?
> > > >
> > > > best regards
> > > > --
> > > > *Hashim Khan*
> > >
> >
> >
> > --
> > *Hashim Khan*


Re: CONSTRUCT query with rsparql

2023-06-14 Thread Martynas Jusevičius
No syntax issue, but with ?o not bound, the ?s ?p ?o triple will
not be constructed.

This might or might not be what you are looking for:

CONSTRUCT
  {
?s ?p ?o .
  }
WHERE
  { ?s  <http://purl.org/dc/terms/title>  ?o ;
<http://purl.org/dc/terms/title>  "the message"
  }

On Wed, Jun 14, 2023 at 12:13 PM Hashim Khan  wrote:
>
> Thanks for the quick response. Could you write the exact query,
> please? By "?o not bound", do you mean that we cannot execute such a
> construct query? Or is there a syntax issue?
>
> On Wed, Jun 14, 2023 at 12:04 PM Martynas Jusevičius 
> wrote:
>
> > ?o is not bound in your WHERE pattern.
> >
> > On Wed, Jun 14, 2023 at 11:50 AM Hashim Khan 
> > wrote:
> > >
> > > Hi,
> > >
> > > ./rsparql --service http://localhost:8890/sparql --query query.sparql
> > > --results=ntriples I want to save the triples in NT format as returned by
> > > the construct query given here.
> > > CONSTRUCT {?s ?p ?o} WHERE {
> > >?s <http://purl.org/dc/terms/title> 'the message' .
> > > }
> > >
> > > I have run this query with SELECT and it returns many triples, but when I
> > > want to CONSTRUCT them I cannot see the effect. It takes about two
> > > seconds, but no output.nt file is generated, nor do I see any triples.
> > > Could anyone please tell me the exact query to accomplish this job?
> > >
> > > best regards
> > > --
> > > *Hashim Khan*
> >
>
>
> --
> *Hashim Khan*


Re: CONSTRUCT query with rsparql

2023-06-14 Thread Martynas Jusevičius
?o is not bound in your WHERE pattern.

On Wed, Jun 14, 2023 at 11:50 AM Hashim Khan  wrote:
>
> Hi,
>
> ./rsparql --service http://localhost:8890/sparql --query query.sparql
> --results=ntriples I want to save the triples in NT format as returned by
> the construct query given here.
> CONSTRUCT {?s ?p ?o} WHERE {
>?s <http://purl.org/dc/terms/title> 'the message' .
> }
>
> I have run this query with SELECT and it returns many triples, but when I want
> to CONSTRUCT them I cannot see the effect. It takes about two seconds, but no
> output.nt file is generated, nor do I see any triples. Could anyone please
> tell me the exact query to accomplish this job?
>
> best regards
> --
> *Hashim Khan*


Re: Executing SHACL over Fuseki dataset

2023-06-06 Thread Martynas Jusevičius
Thanks Øyvind, that looks promising.

On Tue, Jun 6, 2023 at 11:58 AM Øyvind Gjesdal  wrote:
>
> I haven't tried it, but it looks like it is implemented, and you can
> configure a SHACL service on datasets, in the assembler for Fuseki.
> https://jena.apache.org/documentation/shacl/#integration-with-apache-jena-fuseki
>
>
> You can then use the api and post shapes as files for validating datasets:
>
> The example from the documentation :
>
> curl -XPOST --data-binary @fu-shapes.ttl  \
>  --header 'Content-type: text/turtle' \
>  'http://localhost:3030/ds/shacl?graph=default'
>
> Best regards,
> Øyvind
>
>
> On Tue, Jun 6, 2023 at 11:27 AM Martynas Jusevičius 
> wrote:
>
> > Hi,
> >
> > What is the approach to validating data stored in Fuseki with SHACL?
> > Without having to retrieve a data dump first.
> >
> > I have found some generic projects that claim to translate SHACL to SPARQL:
> > https://github.com/rdfshapes/shacl-sparql
> > https://github.com/Shape-Fragments/SHACL2SPARQL
> > Has anyone had any luck with them?
> >
> > I was wondering if a Jena-based solution would be feasible and what
> > would it take.
> >
> >
> > Martynas
> >


Executing SHACL over Fuseki dataset

2023-06-06 Thread Martynas Jusevičius
Hi,

What is the approach to validating data stored in Fuseki with SHACL?
Without having to retrieve a data dump first.

I have found some generic projects that claim to translate SHACL to SPARQL:
https://github.com/rdfshapes/shacl-sparql
https://github.com/Shape-Fragments/SHACL2SPARQL
Has anyone had any luck with them?

I was wondering if a Jena-based solution would be feasible and what
would it take.


Martynas


Re: XSD date functions broken?

2023-04-18 Thread Martynas Jusevičius
Neither 1649185973 nor “1649185973” is a valid xsd:date value. Isn’t that
the issue?
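
If the goal is epoch seconds to xsd:dateTime, one option is to do the
conversion in Java before the value reaches SPARQL (a sketch):

    import java.time.Instant;
    import org.apache.jena.datatypes.xsd.XSDDatatype;
    import org.apache.jena.rdf.model.Literal;
    import org.apache.jena.rdf.model.ResourceFactory;

    String lex = Instant.ofEpochSecond(1649185973).toString(); // "2022-04-05T19:12:53Z"
    Literal dt = ResourceFactory.createTypedLiteral(lex, XSDDatatype.XSDdateTime);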

On Tue, 18 Apr 2023 at 14.00, Enrico.Daga 
wrote:

> Hi Simon,
>
> > The question is what the function should do??
>
> Convert a timestamp into a date time format (and then later into a
> readable string).
>
> Thanks for pointing out the casting issue; however, I tried passing it as
> a string and it does not work either.
>
> What would you recommend?
>
> Thanks!
>
> Enrico
>
>
> --
> Enrico Daga, PhD
>
> www.enridaga.net | @enridaga
>
> SPARQL Anything http://sparql-anything.cc
> Polifonia http://polifonia-project.eu
> SPICE http://spice-h2020.eu
> Open Knowledge Graph http://data.open.ac.uk
>
> Senior Research Fellow, Knowledge Media Institute, STEM Faculty
> The Open University
> Level 4 Berrill Building, Walton Hall, Milton Keynes, MK7 6AA
> Direct: +44 (0) 1908 654887
> 
> From: Simon Bin 
> Sent: 18 April 2023 11:43
> To: users@jena.apache.org 
> Subject: Re: XSD date functions broken?
>
> The question is what the function should do??
>
> if you look here:
> https://www.w3.org/TR/sparql11-query/#FunctionMapping
>
> it is "N"ot allowed to cast from int to dateTime
>
> On Tue, 2023-04-18 at 10:19 +, Enrico.Daga wrote:
> > Hi,
> >
> > I need help using XSD date/time functions. I tried versions 4.2.0 and
> > 4.7.0 and both don't seem to work.
> >
> > Considering this Java code:
> >
> >
> > Dataset kb = DatasetFactory.createGeneral();
> > Query q = QueryFactory.create(q);
> > result = QueryExecutionFactory.create(q, kb).execSelect();
> >
> > The following throws an NPE (no results)
> >
> >
> > String q = "\n" +
> > "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>" +
> > "SELECT ?date WHERE { BIND(xsd:dateTime (1649185973) AS ?date
> > ) }";
> >
> > ...
> > System.err.println(result.next().get("date").toString());
> >
> > While the cast to int works fine:
> >
> >
> > String q = "\n" +
> > "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>" +
> > "SELECT ?date WHERE { BIND(xsd:int (1649185973) AS ?date )
> > }";
> > System.err.println(executeARQ(q).next().get("date").toString());
> >
> > 1649185973^^http://www.w3.org/2001/XMLSchema#int
> >
> > Am I missing anything?
> >
> > Best,
> >
> > Enrico
> >
> > --
> > Enrico Daga, PhD
> >
> >
> > www.enridaga.net | @enridaga
> >
> > SPARQL Anything
> 

RDF-syntax-check GH action

2023-02-28 Thread Martynas Jusevičius
Hi,

I made a riot-based RDF syntax checking GH action:
https://github.com/AtomGraph/RDF-syntax-check

Maybe it can be of use :)


Martynas


Re: [SHACL] sh:prefixes

2023-02-28 Thread Martynas Jusevičius
Thanks Holger, that is the case :) The "sh:prefixes ex:" in the SHACL spec
was somewhat confusing without context, because it looks like a Turtle
prefix declaration.

This works:

:prefixes sh:declare [
sh:prefix "skos" ;
sh:namespace "http://www.w3.org/2004/02/skos/core#"^^xsd:anyURI ;
] .

<#ConceptBroaderCycleShape> sh:prefixes :prefixes ;
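
A quick way to sanity-check that the shapes now parse and run (a sketch;
shapesModel and dataModel are assumed to hold the shapes and the data):

import org.apache.jena.shacl.ShaclValidator;
import org.apache.jena.shacl.Shapes;
import org.apache.jena.shacl.ValidationReport;

Shapes shapes = Shapes.parse(shapesModel.getGraph());
ValidationReport report = ShaclValidator.get().validate(shapes, dataModel.getGraph());
System.out.println("Conforms: " + report.conforms());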


On Tue, Feb 28, 2023 at 2:16 PM Holger Knublauch 
wrote:

> I suspect this is another instance of the common misunderstanding: the
> @prefix declarations are not mapped to any triples and are only a concept
> of the serialization. To make them visible to SHACL, you need to declare
> triples such as in
>
> Shapes Constraint Language (SHACL)
> <https://www.w3.org/TR/shacl/#sparql-prefixes>
>
> in your case it would be skos: as subject.
>
> Holger
>
>
> On 28 Feb 2023, at 12:56 pm, Martynas Jusevičius 
> wrote:
>
> Hi,
>
> Does Jena's SHACL engine support sh:prefixes? Or am I using them wrong?
>
> The following test shape
>
> @prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
> @prefix sh: <http://www.w3.org/ns/shacl#> .
>
> <#ConceptBroaderCycleShape>
>a sh:NodeShape ;
>sh:targetClass skos:Concept ;
>sh:sparql [
>a sh:SPARQLConstraint ;
>sh:message "Concept is broader than itself (directly or
> indirectly)" ;
>sh:prefixes skos: ;
>sh:select """
>SELECT *
>{
>$this skos:broader+ $this .
>}
>""" ;
>] .
>
> returns an error:
>
> org.apache.jena.shacl.parser.ShaclParseException: Bad query: Line 5,
> column 23: Unresolved prefixed name: skos:broader
> at org.apache.jena.shacl.lib.ShLib.parseQueryString(ShLib.java:262)
> at org.apache.jena.shacl.lib.ShLib.extractSPARQLQuery(ShLib.java:270)
> at
> org.apache.jena.shacl.engine.SparqlConstraints.parseSparqlConstraint(SparqlConstraints.java:64)
> at
> org.apache.jena.shacl.parser.Constraints.lambda$static$20(Constraints.java:116)
> at
> org.apache.jena.shacl.parser.Constraints.parseConstraint(Constraints.java:176)
> at
> org.apache.jena.shacl.parser.Constraints.parseConstraints(Constraints.java:160)
> at
> org.apache.jena.shacl.parser.ShapesParser.parseShape$(ShapesParser.java:319)
> at
> org.apache.jena.shacl.parser.ShapesParser.parseShapeStep(ShapesParser.java:305)
> at
> org.apache.jena.shacl.parser.ShapesParser.parseShape(ShapesParser.java:236)
> at
> org.apache.jena.shacl.parser.ShapesParser.parseShapeAcc(ShapesParser.java:221)
> at
> org.apache.jena.shacl.parser.ShapesParser.parseShapes(ShapesParser.java:163)
> at
> org.apache.jena.shacl.parser.ShapesParser.parseProcess(ShapesParser.java:100)
> at org.apache.jena.shacl.Shapes.parseProcess(Shapes.java:111)
> at org.apache.jena.shacl.Shapes.parseAll(Shapes.java:106)
> at org.apache.jena.shacl.Shapes.parse(Shapes.java:83)
> at
> org.apache.jena.shacl.validation.ShaclPlainValidator.parse(ShaclPlainValidator.java:38)
> at
> org.apache.jena.shacl.validation.ShaclPlainValidator.validate(ShaclPlainValidator.java:90)
> at shacl.shacl_validate.exec(shacl_validate.java:124)
> at org.apache.jena.cmd.CmdMain.mainMethod(CmdMain.java:87)
> at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:56)
> at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:43)
> at shacl.shacl_validate.main(shacl_validate.java:60)
> at shacl.shacl.main(shacl.java:81)
>
>
> Martynas
>
>
>


[SHACL] sh:prefixes

2023-02-28 Thread Martynas Jusevičius
Hi,

Does Jena's SHACL engine support sh:prefixes? Or am I using them wrong?

The following test shape

@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

<#ConceptBroaderCycleShape>
a sh:NodeShape ;
sh:targetClass skos:Concept ;
sh:sparql [
a sh:SPARQLConstraint ;
sh:message "Concept is broader than itself (directly or indirectly)" ;
sh:prefixes skos: ;
sh:select """
SELECT *
{
$this skos:broader+ $this .
}
""" ;
] .

returns an error:

org.apache.jena.shacl.parser.ShaclParseException: Bad query: Line 5,
column 23: Unresolved prefixed name: skos:broader
at org.apache.jena.shacl.lib.ShLib.parseQueryString(ShLib.java:262)
at org.apache.jena.shacl.lib.ShLib.extractSPARQLQuery(ShLib.java:270)
at 
org.apache.jena.shacl.engine.SparqlConstraints.parseSparqlConstraint(SparqlConstraints.java:64)
at 
org.apache.jena.shacl.parser.Constraints.lambda$static$20(Constraints.java:116)
at 
org.apache.jena.shacl.parser.Constraints.parseConstraint(Constraints.java:176)
at 
org.apache.jena.shacl.parser.Constraints.parseConstraints(Constraints.java:160)
at org.apache.jena.shacl.parser.ShapesParser.parseShape$(ShapesParser.java:319)
at 
org.apache.jena.shacl.parser.ShapesParser.parseShapeStep(ShapesParser.java:305)
at org.apache.jena.shacl.parser.ShapesParser.parseShape(ShapesParser.java:236)
at 
org.apache.jena.shacl.parser.ShapesParser.parseShapeAcc(ShapesParser.java:221)
at org.apache.jena.shacl.parser.ShapesParser.parseShapes(ShapesParser.java:163)
at org.apache.jena.shacl.parser.ShapesParser.parseProcess(ShapesParser.java:100)
at org.apache.jena.shacl.Shapes.parseProcess(Shapes.java:111)
at org.apache.jena.shacl.Shapes.parseAll(Shapes.java:106)
at org.apache.jena.shacl.Shapes.parse(Shapes.java:83)
at 
org.apache.jena.shacl.validation.ShaclPlainValidator.parse(ShaclPlainValidator.java:38)
at 
org.apache.jena.shacl.validation.ShaclPlainValidator.validate(ShaclPlainValidator.java:90)
at shacl.shacl_validate.exec(shacl_validate.java:124)
at org.apache.jena.cmd.CmdMain.mainMethod(CmdMain.java:87)
at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:56)
at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:43)
at shacl.shacl_validate.main(shacl_validate.java:60)
at shacl.shacl.main(shacl.java:81)


Martynas


Re: SHACLC and RDFLanguages

2022-12-15 Thread Martynas Jusevičius
I was looking into this again... I see that RDFParserRegistry already
has the required collections langTriples and langQuads.

I think simply adding accessors for them would solve my use case.
Something like this maybe?

public static Set<Lang> registeredLangTriples()
{
return Collections.unmodifiableSet(langTriples);
}

public static Set<Lang> registeredLangQuads()
{
return Collections.unmodifiableSet(langQuads);
}


I don't see a similar collection under ResultSetReaderRegistry though?
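
In the meantime the filtering looks roughly like this (a sketch; it
assumes RDFLanguages.getRegisteredLanguages() and the isRegistered
predicates discussed below):

import java.util.List;
import java.util.stream.Collectors;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFLanguages;
import org.apache.jena.riot.RDFParserRegistry;

// languages that can actually be used to read Models (triples)
List<Lang> modelLangs = RDFLanguages.getRegisteredLanguages().stream()
        .filter(RDFParserRegistry::isRegistered)
        .filter(RDFLanguages::isTriples)
        .collect(Collectors.toList());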

On Mon, May 23, 2022 at 6:04 PM Andy Seaborne  wrote:
>
> PRs welcome.
>
> On 23/05/2022 08:51, Martynas Jusevičius wrote:
> > Could RDFParserRegistry::getRegistered and
> > ResultSetReaderRegistry::getRegistered be added?
>
> Consistent naming would be registered().
>
>  Andy
>
> > On Fri, May 20, 2022 at 9:01 PM Andy Seaborne  wrote:
> >>
> >>
> >>
> >> On 20/05/2022 14:05, Martynas Jusevičius wrote:
> >>> Andy, is that correct?
> >>
> >> Yes
> >>
> >>   Andy
> >>
> >>>
> >>> On Tue, May 17, 2022 at 1:33 PM Martynas Jusevičius
> >>>  wrote:
> >>>>
> >>>> On Tue, May 17, 2022 at 1:19 PM Andy Seaborne  wrote:
> >>>>>
> >>>>> RDFLanguages is a general registry of names (Lang's) in the system.
> >>>>>
> >>>>> It is not for functionality.
> >>>>>
> >>>>> RDFParserRegistry
> >>>>> RDFWriterRegistry
> >>>>> RowSetReaderRegistry, ResultSetReaderRegistry
> >>>>> RowSetWriterRegistry, ResultSetWriterRegistry
> >>>>> StreamRDFWriter
> >>>>>
> >>>>> A Lang needs looking up in a registry to see if there is support for it.
> >>>>
> >>>> Thanks, I didn't know these existed.
> >>>>
> >>>> But there are no RDFParserRegistry::getRegistered or
> >>>> ResultSetReaderRegistry::getRegistered methods?
> >>>>
> >>>> So do I still need to iterate RDFLanguages::getRegistered and check
> >>>> each Lang against
> >>>> RDFParserRegistry::isRegistered/ResultSetReaderRegistry::isRegistered?
> >>>>
> >>>>>
> >>>>>Andy
> >>>>>
> >>>>> On 17/05/2022 09:54, Martynas Jusevičius wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> After upgrading from 4.3.2 to 4.5.0, some of our RDF writing code
> >>>>>> started failing.
> >>>>>>
> >>>>>> It seems that this is due to RDFLanguages.isTriples(Lang.SHACLC)
> >>>>>> returning true, which messes up our content negotiation as it attempts
> >>>>>> to write Models as SHACLC. Can this be rectified?
> >>>>>>
> >>>>>> The RDFLanguages registry is a bit of an oxymoron in general. Right
> >>>>>> now it's a bag of all sorts of syntaxes Jena supports, half of which
> >>>>>> are not even "RDF languages". We need to iterate and filter the
> >>>>>> languages just to know which ones can be used to read/write Models,
> >>>>>> which can be used for ResultSets etc.:
> >>>>>> https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/MediaTypes.java#L86
> >>>>>> Wouldn't it make sense to have separate registries depending on the
> >>>>>> entity types they apply to?
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>> Martynas


Re: Replacement for CSVInput and TSVInput?

2022-12-15 Thread Martynas Jusevičius
Thanks.
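
For the archive, a minimal sketch of the replacement (assuming the Jena
4.6 ResultSetLang constants and an InputStream in, as in the code below):

import org.apache.jena.query.ResultSet;
import org.apache.jena.query.ResultSetFactory;
import org.apache.jena.riot.ResultSetMgr;
import org.apache.jena.riot.resultset.ResultSetLang;

ResultSet csv = ResultSetFactory.makeRewindable(ResultSetMgr.read(in, ResultSetLang.RS_CSV));
ResultSet tsv = ResultSetFactory.makeRewindable(ResultSetMgr.read(in, ResultSetLang.RS_TSV));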

On Mon, Dec 12, 2022 at 7:55 PM Andy Seaborne  wrote:
>
> ResultSetMgr, which uses ResultsReader
>
>
> On 12/12/2022 15:45, Martynas Jusevičius wrote:
> > Hi,
> >
> > I'm upgrading Jena 4.5.0 to 4.6.1.
> >
> > I can see that org.apache.jena.sparql.resultset.CSVInput is gone and
> > org.apache.jena.sparql.resultset.TSVInput is deprecated.
> >
> > What are the replacements for them?
> >
> > My code was the following:
> >
> >  if 
> > (mediaType.isCompatible(com.atomgraph.core.MediaType.APPLICATION_SPARQL_RESULTS_XML_TYPE))
> >  return
> > ResultSetFactory.makeRewindable(ResultSetFactory.fromXML(in));
> >  if 
> > (mediaType.isCompatible(com.atomgraph.core.MediaType.APPLICATION_SPARQL_RESULTS_JSON_TYPE))
> >  return
> > ResultSetFactory.makeRewindable(ResultSetFactory.fromJSON(in));
> >  if 
> > (mediaType.isCompatible(com.atomgraph.core.MediaType.APPLICATION_SPARQL_RESULTS_CSV_TYPE))
> >  return ResultSetFactory.makeRewindable(CSVInput.fromCSV(in));
> >  if 
> > (mediaType.isCompatible(com.atomgraph.core.MediaType.APPLICATION_SPARQL_RESULTS_TSV_TYPE))
> >  return ResultSetFactory.makeRewindable(TSVInput.fromTSV(in));
> >
> > Thanks.
> >
> > Martynas
> > atomgraph.com


Replacement for org.apache.jena.rdfxml.xmloutput.impl.Basic

2022-12-15 Thread Martynas Jusevičius
Hi,

org.apache.jena.rdfxml.xmloutput.impl.Basic is gone in 4.6.1.

How do I rewrite the following code then?

RDFWriterI writer = new Basic();
writer.setProperty("allowBadURIs", true); // round-tripping RDF/POST with user input may contain invalid URIs
writer.write(model, baos, null);

Thanks.
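
One possible replacement sketch (assuming Model.getWriter("RDF/XML")
still returns an RDFWriterI in 4.6.1; model and baos as above):

import org.apache.jena.rdf.model.RDFWriterI;

RDFWriterI writer = model.getWriter("RDF/XML");
writer.setProperty("allowBadURIs", true); // keep the round-tripping behaviour
writer.write(model, baos, null);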


Replacement for CSVInput and TSVInput?

2022-12-12 Thread Martynas Jusevičius
Hi,

I'm upgrading Jena 4.5.0 to 4.6.1.

I can see that org.apache.jena.sparql.resultset.CSVInput is gone and
org.apache.jena.sparql.resultset.TSVInput is deprecated.

What are the replacements for them?

My code was the following:

if 
(mediaType.isCompatible(com.atomgraph.core.MediaType.APPLICATION_SPARQL_RESULTS_XML_TYPE))
return
ResultSetFactory.makeRewindable(ResultSetFactory.fromXML(in));
if 
(mediaType.isCompatible(com.atomgraph.core.MediaType.APPLICATION_SPARQL_RESULTS_JSON_TYPE))
return
ResultSetFactory.makeRewindable(ResultSetFactory.fromJSON(in));
if 
(mediaType.isCompatible(com.atomgraph.core.MediaType.APPLICATION_SPARQL_RESULTS_CSV_TYPE))
return ResultSetFactory.makeRewindable(CSVInput.fromCSV(in));
if 
(mediaType.isCompatible(com.atomgraph.core.MediaType.APPLICATION_SPARQL_RESULTS_TSV_TYPE))
return ResultSetFactory.makeRewindable(TSVInput.fromTSV(in));

Thanks.

Martynas
atomgraph.com


Re: How to Use Fuseki Backup API

2022-11-27 Thread Martynas Jusevičius
On Sun, Nov 27, 2022 at 7:20 PM Bruno Kinoshita  wrote:
>
> Ah, at the top of this page:
> https://jena.apache.org/documentation/fuseki2/fuseki-main#fuseki-docker
>
> It says: "Fuseki main is a packaging of Fuseki as a triple store without a
> UI for administration."

The coupling between the UI and the administration is pretty weird TBH.

Isn't it perfectly normal to invoke the administration features (which
are just HTTP endpoints anyway) without using the UI?

> And further down: "The main server does not depend
> on any files on disk (other than for databases provided by the
> application), and does not provide the Fuseki UI or admins functions to
> create dataset via HTTP.". I had forgotten about that change.
>
> So I believe you are right Tim, what you must have in your container is
> Fuseki main without the UI, so without the backup servlet & endpoint
> binding (thus the 404). You can have a look at the page about Fuseki + UI
> for options for running it separately with access to admin features:
> https://jena.apache.org/documentation/fuseki2/fuseki-webapp.html
>
> -Bruno
>
> On Sun, 27 Nov 2022 at 19:12, Bruno Kinoshita 
> wrote:
>
> > I got the same result following the docs for the Docker compose
> > installation:
> > https://jena.apache.org/documentation/fuseki2/fuseki-main#fuseki-docker
> >
> > Adding --update didn't solve it. So there might be something that needs to
> > be enabled in the dataset assembler configuration when you create the
> > dataset in the container.
> >
> > On Sun, 27 Nov 2022 at 18:56, Tim McIver  wrote:
> >
> >> It's not working for me.  I even tried doing it from the fuseki
> >> container.  It seems this image does not have curl so I tried wget using
> >> 'wget http://localhost:3030/$/backup/ds --post-data ""'. Again, I get a
> >> 404.
> >>
> >>
> >> Do the admin endpoints have to be specifically enabled?  Or could they
> >> have been disabled?
> >>
> >> Tim
> >>
> >> On 11/27/22 12:07, Bruno Kinoshita wrote:
> >> > Hi Tim,
> >> >
> >> > I am not using a container, but I just tested the latest version from
> >> Git
> >> > on Eclipse, and tested the endpoints with curl to query and backup.
> >> Maybe
> >> > your endpoint URL is missing something?
> >> >
> >> > 1. Create ds in-memory dataset
> >> > 2. Load some dummy data
> >> > 3. curl a query: $ curl 'http://localhost:3030/ds/' -X POST --data-raw
> >> > 'query=...' (success, data returned as expected)
> >> > 4. curl to trigger a backup: $ curl 'http://localhost:3030/$/backup/ds'
> >> -X
> >> > POST
> >> >
> >> > Then, if you want, you can also query for the tasks (a back up creates
> >> an
> >> > async task on the server):
> >> >
> >> > $ curl http://localhost:3030/$/tasks
> >> > [ {
> >> > "task" : "Backup" ,
> >> > "taskId" : "1" ,
> >> > "started" : "2022-11-27T18:06:01.868+01:00" ,
> >> > "finished" : "2022-11-27T18:06:01.893+01:00" ,
> >> > "success" : true
> >> > }
> >> > ]
> >> >
> >> > -Bruno
> >> >
> >> > On Sun, 27 Nov 2022 at 17:55, Tim McIver  wrote:
> >> >
> >> >> I should mention also that the Docker image that I'm using in this case
> >> >> comes from here .
> >> >>
> >> >> On 11/27/22 11:43, Tim McIver wrote:
> >> >>> Hello,
> >> >>>
> >> >>> I'd like to backup my Fuseki data using the web API. I found
> >> >>> documentation about how to do that here
> >> >>> <
> >> >>
> >> https://jena.apache.org/documentation/fuseki2/fuseki-server-protocol.html#backup
> >> >.
> >> >>
> >> >>> But when I try use the listed endpoints, they all result in a 404.
> >> >>> I'm using curl from a container in a Docker network to do this. I
> >> >>> know that I can connect to the server because a call like "curl
> >> >>> http:/:3030/ds" returns data with content type
> >> >>> "application/trig".
> >> >>>
> >> >>> What am I missing? Any help would be appreciated.
> >> >>>
> >> >>> Tim
> >> >>>
> >>
> >


Re: Encoding URI component

2022-11-10 Thread Martynas Jusevičius
Got it.

I found the URL on SO: https://stackoverflow.com/a/67021655/1003113
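
Usage is a one-liner (a sketch; the encoded output shown is only indicative):

import org.apache.jena.riot.web.IRILib;

String value = IRILib.encodeUriComponent("a b&c"); // e.g. "a%20b%26c"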

On Thu, Nov 10, 2022 at 9:40 PM Andy Seaborne  wrote:
>
>
>
> On 10/11/2022 18:57, Martynas Jusevičius wrote:
> > Hi,
> >
> > What's the current way of encoding a URI component? E.g. query string
> > value, or a fragment.
>
> IRILib.encodeUriComponent
>
> >
> > I'm guessing there is something in the IRIx package, but its Javadoc is 
> > down:
> > https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/irix/package-summary.html
>
> No it's not.
>
> Go into the javadoc and use the search.
>
> This is the way.
>
> https://jena.apache.org/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/irix/package-summary.html
>
>  Andy
>
> >
> > Martynas


Encoding URI component

2022-11-10 Thread Martynas Jusevičius
Hi,

What's the current way of encoding a URI component? E.g. query string
value, or a fragment.

I'm guessing there is something in the IRIx package, but its Javadoc is down:
https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/irix/package-summary.html

Martynas


Re: Fuseki container is OOMKilled

2022-09-30 Thread Martynas Jusevičius
On Fri, Sep 30, 2022 at 5:06 PM Andy Seaborne  wrote:
>
> On 30/09/2022 12:39, Martynas Jusevičius wrote:
> > On Fri, Sep 30, 2022 at 11:40 AM Andy Seaborne  wrote:
> >>
> >> Depends on what "runs out of memory means".
> >
> > MEM% in docker stats reaches 100% and the container quits. Inspecting
> > it then shows OOMKilled: true.
>
> So you have a 10G container with a max 8G heap that crashes out when
> the heap is at 40%. That does not add up.

I agree it's odd.

This article suggests that in Docker OOMKilled happens when "the max
heap size is incorrect (over the available memory limit)" and that
"The java.lang.OutOfMemoryError is different. That indicates that the
max heap size is not enough to hold all live objects in memory".
https://dzone.com/articles/why-my-java-application-is-oomkilled

But what takes up the memory then? And how can this be investigated?
The profiler setup is ready.

LinkedDataHub runs with mem_limit: 1872m in the same setup and never
gets OOMKilled, but it uses -XX:+UseContainerSupport
-XX:MaxRAMPercentage=75 instead of -Xmx/-Xms.

>
> The JVM takes may be 0.5G (VM, loaded classes etc).
> The kernel takes what? 0.5G tops. The rest is file system cache unless
> another process is involved.
>
> TDB uses the file system cache which flexes up and down and is mapped
> (virtual memory) into the address space of the JVM.
>
> Fuseki/TDB does not use off-heap same-process memory (IIRC Lucene might
> if available but this is fixed size.)
>
>  Andy
>
> >> If the container is being killed by the host, then it is likely some
> >> process in the container is asking for memory (sbrk), and there is a
> >> container limit (or ulimit?) being exceeded, so the container runtime or
> >> the host kills the container.
> >>
> >
> > Not sure if there's another way to check, but using top shows only the
> > java process running inside the container.
> >
> >> One source of sbrk is a JVM growing the heap. It is not the only source
> >> in the container OS.
> >>
> >> There is also the direct memory space in a JVM - it's fixed size (it
> >> sits below the heap).
> >>
> >
> > So if heap size was lowered (without changing mem_limit) then the
> > off-heap memory got an increase? And the issue went away, meaning it
> > wasn't large enough initially.
>  >
> >> Java avoids doing major GC when it can grow the heap.  Java will use
> >> heap until it approaches the heap limit rather than do a full GC so it
> >> can use more memory that it needs.  This is regardless of the memory
> >> working set size.  It sbrk's even when it does not need to - a
> >> time/space tradeoff.
> >
> > My observation in VisualVM was that the heap size did not exceed ~40% of
> > the max size. When I had allocated 8GB the GC kicked in around 3GB,
> > when I allocated 4GB it kicked in around 1.5GB.
> >
> >>
> >> If it works at 4G Fuseki isn't leaking memory.
> >
> > Yes but it would be nice to have general guidelines for use in Docker
> > because this seems to be a recurring issue.
> >
> >>
> >> Is the container limited?
> >> Is the container runtime limited?
> >
> > As in mem_limit? As mentioned initially, mem_limit was at 10GB the whole 
> > time.
> > Dockerfile is the default Dockerfile that ships with Fuseki.
> >
> > I've committed a Dockerfile and settings that can be used to a remote
> > Fuseki from VisualVM using a JMX connection:
> > https://github.com/AtomGraph/fuseki-docker#profiling
> >
> >>
> >>   Andy
> >>
> >> On 30/09/2022 06:36, Lorenz Buehmann wrote:
> >>>   From my understanding, a larger heap space for Fuseki should only be
> >>> necessary when doing reasoning or e.g. loading the geospatial index. A
> >>> TDB database on the other hand is backed by memory mapped files, i.e.
> >>> makes use of off-heap memory and let the OS do all the work.
> >>>
> >>> Indeed, I cannot explain why assigning more heap makes the Fuseki thread
> >>> consume as much until OOM is reached.
> >>>
> >>> We also have a Fuseki Docker deployment, and we assigned Fuseki way more
> >>> memory (64GB) because of generating a large scale spatial index needs
> >>> all geometry objects in memory once. But it didn't crash because of what
> >>> you describe with the query load (maybe we never had such a constant 
> >>> load).
> >>>
> >>> Indeed, comparison is dif

Re: Fuseki container is OOMKilled

2022-09-30 Thread Martynas Jusevičius
On Fri, Sep 30, 2022 at 11:40 AM Andy Seaborne  wrote:
>
> Depends on what "runs out of memory means".

MEM% in docker stats reaches 100% and the container quits. Inspecting
it then shows OOMKilled: true.

>
> If the container is being killed by the host, then it is likely some
> process in the container is asking for memory (sbrk), and there is a
> container limit (or ulimit?) being exceeded, so the container runtime or
> the host kills the container.
>

Not sure if there's another way to check, but using top shows only the
java process running inside the container.

> One source of sbrk is a JVM growing the heap. It is not the only source
> in the container OS.
>
> There is also the direct memory space in a JVM - it's fixed size (it
> sits below the heap).
>

So if heap size was lowered (without changing mem_limit) then the
off-heap memory got an increase? And the issue went away, meaning it
wasn't large enough initially.

> Java avoids doing major GC when it can grow the heap.  Java will use
> heap until it approaches the heap limit rather than do a full GC so it
> can use more memory that it needs.  This is regardless of the memory
> working set size.  It sbrk's even when it does not need to - a
> time/space tradeoff.

My observation in VisualVM was that the heap size did not exceed ~40% of
the max size. When I had allocated 8GB the GC kicked in around 3GB,
when I allocated 4GB it kicked in around 1.5GB.

>
> If it works at 4G Fuseki isn't leaking memory.

Yes but it would be nice to have general guidelines for use in Docker
because this seems to be a recurring issue.

>
> Is the container limited?
> Is the container runtime limited?

As in mem_limit? As mentioned initially, mem_limit was at 10GB the whole time.
Dockerfile is the default Dockerfile that ships with Fuseki.

I've committed a Dockerfile and settings that can be used to a remote
Fuseki from VisualVM using a JMX connection:
https://github.com/AtomGraph/fuseki-docker#profiling

>
>  Andy
>
> On 30/09/2022 06:36, Lorenz Buehmann wrote:
> >  From my understanding, a larger heap space for Fuseki should only be
> > necessary when doing reasoning or e.g. loading the geospatial index. A
> > TDB database on the other hand is backed by memory mapped files, i.e.
> > makes use of off-heap memory and let the OS do all the work.
> >
> > Indeed, I cannot explain why assigning more heap makes the Fuseki thread
> > consume as much until OOM is reached.
> >
> > We also have a Fuseki Docker deployment, and we assigned Fuseki way more
> > memory (64GB) because of generating a large scale spatial index needs
> > all geometry objects in memory once. But it didn't crash because of what
> > you describe with the query load (maybe we never had such a constant load).
> >
> > Indeed, comparison is difficult, different machines, different Docker
> > container, different Fuseki version ...
> >
> >
> > I think Andy will have better explanations and maybe also others like
> > Rob or people already using Fuseki@Docker
> >
> > On 29.09.22 16:45, Martynas Jusevičius wrote:
> >> Still hasn't crashed, so less heap could be the solution in this case.
> >>
> >> On Thu, Sep 29, 2022 at 3:12 PM Martynas Jusevičius
> >>  wrote:
> >>> I've lowered the heap size to 4GB to leave more off-heap memory (6GB).
> >>> It's been an hour and OOMKilled hasn't happened yet unlike before.
> >>> MEM% in docker stats peaks around 70%.
> >>>
> >>> On Thu, Sep 29, 2022 at 12:41 PM Martynas Jusevičius
> >>>  wrote:
> >>>> OK the findings are weird so far...
> >>>>
> >>>> Under constant query load on my local Docker, MEM% of the Fuseki
> >>>> container reached 100% within 45 minutes and it got OOMKilled.
> >>>>
> >>>> However, the Used heap "teeth" in VisualVM were below 3GB of the total
> >>>> ~8GB Heap size the whole time.
> >>>>
> >>>> What does that tell us?
> >>>>
> >>>>
> >>>> On Thu, Sep 29, 2022 at 11:58 AM Martynas Jusevičius
> >>>>  wrote:
> >>>>> Hi Eugen,
> >>>>>
> >>>>> I have the debugger working, I was trying to connect the profiler :)
> >>>>> Finally I managed to connect from VisualVM on Windows thanks to this
> >>>>> answer:
> >>>>> https://stackoverflow.com/questions/66222727/how-to-connect-to-jmx-server-running-inside-wsl2/71881475#71881475
> >>>>>
> >>>>>
> >>>>> I've launched an infinite curl

Re: Fuseki container is OOMKilled

2022-09-29 Thread Martynas Jusevičius
Still hasn't crashed, so less heap could be the solution in this case.

On Thu, Sep 29, 2022 at 3:12 PM Martynas Jusevičius
 wrote:
>
> I've lowered the heap size to 4GB to leave more off-heap memory (6GB).
> It's been an hour and OOMKilled hasn't happened yet unlike before.
> MEM% in docker stats peaks around 70%.
>
> On Thu, Sep 29, 2022 at 12:41 PM Martynas Jusevičius
>  wrote:
> >
> > OK the findings are weird so far...
> >
> > Under constant query load on my local Docker, MEM% of the Fuseki
> > container reached 100% within 45 minutes and it got OOMKilled.
> >
> > However, the Used heap "teeth" in VisualVM were below 3GB of the total
> > ~8GB Heap size the whole time.
> >
> > What does that tell us?
> >
> >
> > On Thu, Sep 29, 2022 at 11:58 AM Martynas Jusevičius
> >  wrote:
> > >
> > > Hi Eugen,
> > >
> > > I have the debugger working, I was trying to connect the profiler :)
> > > Finally I managed to connect from VisualVM on Windows thanks to this
> > > answer: 
> > > https://stackoverflow.com/questions/66222727/how-to-connect-to-jmx-server-running-inside-wsl2/71881475#71881475
> > >
> > > I've launched an infinite curl loop to create some query load, but
> > > what now? What should I be looking for in VisualVM?
> > >
> > > On Thu, Sep 29, 2022 at 11:33 AM Eugen Stan  
> > > wrote:
> > > >
> > > > For debugging, you need to do the following:
> > > >
> > > > * pass JVM options to enable debugging
> > > > * expose docker port for JVM debug you chose
> > > >
> > > > https://stackoverflow.com/questions/138511/what-are-java-command-line-options-to-set-to-allow-jvm-to-be-remotely-debugged
> > > >
> > > > You should be able to do all this without changing the image: docker env
> > > > variables and docker port option.
> > > >
> > > > Once container is started and port is listening, open (confirm with
> > > > docker ps) connect to it to debug.
> > > >
> > > > Good luck,
> > > >
> > > > On 29.09.2022 11:22, Martynas Jusevičius wrote:
> > > > > On Thu, Sep 29, 2022 at 9:41 AM Lorenz Buehmann
> > > > >  wrote:
> > > > >>
> > > > >> You're working on an in-memory dataset?
> > > > >
> > > > > No the datasets are TDB2-backed
> > > > >
> > > > >> Does it also happen with Jena 4.6.1?
> > > > >
> > > > > Don't know :)
> > > > >
> > > > > I wanted to run a profiler and tried connecting from VisualVM on
> > > > > Windows to the Fuseki container but neither jstatd nor JMX connections
> > > > > worked...
> > > > > Now I want to run VisualVM inside the container itself but this
> > > > > requires changing the Docker image in a way that I haven't figured out
> > > > > yet.
> > > > >
> > > > >>
> > > > >> On 28.09.22 20:23, Martynas Jusevičius wrote:
> > > > >>> Hi,
> > > > >>>
> > > > >>> We have a dockerized Fuseki 4.5.0 instance that is gradually running
> > > > >>> out of memory over the course of a few hours.
> > > > >>>
> > > > >>> 3 datasets, none larger than 10 triples. The load is negligible
> > > > >>> (maybe a few bursts x 10 simple queries per minute), no updates.
> > > > >>>
> > > > >>> Dockerfile: 
> > > > >>> https://github.com/AtomGraph/fuseki-docker/blob/master/Dockerfile
> > > > >>> Memory settings:
> > > > >>> mem_limit: 10240m
> > > > >>> JAVA_OPTIONS=-Xmx7700m -Xms7700m
> > > > >>>
> > > > >>> Any advice?
> > > > >>>
> > > > >>> Martynas
> > > >
> > > > --
> > > > Eugen Stan
> > > >
> > > > +40770 941 271  / https://www.netdava.com


Re: Fuseki container is OOMKilled

2022-09-29 Thread Martynas Jusevičius
I've lowered the heap size to 4GB to leave more off-heap memory (6GB).
It's been an hour and OOMKilled hasn't happened yet unlike before.
MEM% in docker stats peaks around 70%.

On Thu, Sep 29, 2022 at 12:41 PM Martynas Jusevičius
 wrote:
>
> OK the findings are weird so far...
>
> Under constant query load on my local Docker, MEM% of the Fuseki
> container reached 100% within 45 minutes and it got OOMKilled.
>
> However, the Used heap "teeth" in VisualVM were below 3GB of the total
> ~8GB Heap size the whole time.
>
> What does that tell us?
>
>
> On Thu, Sep 29, 2022 at 11:58 AM Martynas Jusevičius
>  wrote:
> >
> > Hi Eugen,
> >
> > I have the debugger working, I was trying to connect the profiler :)
> > Finally I managed to connect from VisualVM on Windows thanks to this
> > answer: 
> > https://stackoverflow.com/questions/66222727/how-to-connect-to-jmx-server-running-inside-wsl2/71881475#71881475
> >
> > I've launched an infinite curl loop to create some query load, but
> > what now? What should I be looking for in VisualVM?
> >
> > On Thu, Sep 29, 2022 at 11:33 AM Eugen Stan  wrote:
> > >
> > > For debugging, you need to do the following:
> > >
> > > * pass JVM options to enable debugging
> > > * expose docker port for JVM debug you chose
> > >
> > > https://stackoverflow.com/questions/138511/what-are-java-command-line-options-to-set-to-allow-jvm-to-be-remotely-debugged
> > >
> > > You should be able to do all this without changing the image: docker env
> > > variables and docker port option.
> > >
> > > Once container is started and port is listening, open (confirm with
> > > docker ps) connect to it to debug.
> > >
> > > Good luck,
> > >
> > > On 29.09.2022 11:22, Martynas Jusevičius wrote:
> > > > On Thu, Sep 29, 2022 at 9:41 AM Lorenz Buehmann
> > > >  wrote:
> > > >>
> > > >> You're working on an in-memory dataset?
> > > >
> > > > No the datasets are TDB2-backed
> > > >
> > > >> Does it also happen with Jena 4.6.1?
> > > >
> > > > Don't know :)
> > > >
> > > > I wanted to run a profiler and tried connecting from VisualVM on
> > > > Windows to the Fuseki container but neither jstatd nor JMX connections
> > > > worked...
> > > > Now I want to run VisualVM inside the container itself but this
> > > > requires changing the Docker image in a way that I haven't figured out
> > > > yet.
> > > >
> > > >>
> > > >> On 28.09.22 20:23, Martynas Jusevičius wrote:
> > > >>> Hi,
> > > >>>
> > > >>> We have a dockerized Fuseki 4.5.0 instance that is gradually running
> > > >>> out of memory over the course of a few hours.
> > > >>>
> > > >>> 3 datasets, none larger than 10 triples. The load is negligible
> > > >>> (maybe a few bursts x 10 simple queries per minute), no updates.
> > > >>>
> > > >>> Dockerfile: 
> > > >>> https://github.com/AtomGraph/fuseki-docker/blob/master/Dockerfile
> > > >>> Memory settings:
> > > >>> mem_limit: 10240m
> > > >>> JAVA_OPTIONS=-Xmx7700m -Xms7700m
> > > >>>
> > > >>> Any advice?
> > > >>>
> > > >>> Martynas
> > >
> > > --
> > > Eugen Stan
> > >
> > > +40770 941 271  / https://www.netdava.com


Re: Fuseki container is OOMKilled

2022-09-29 Thread Martynas Jusevičius
OK the findings are weird so far...

Under constant query load on my local Docker, MEM% of the Fuseki
container reached 100% within 45 minutes and it got OOMKilled.

However, the Used heap "teeth" in VisualVM were below 3GB of the total
~8GB Heap size the whole time.

What does that tell us?


On Thu, Sep 29, 2022 at 11:58 AM Martynas Jusevičius
 wrote:
>
> Hi Eugen,
>
> I have the debugger working, I was trying to connect the profiler :)
> Finally I managed to connect from VisualVM on Windows thanks to this
> answer: 
> https://stackoverflow.com/questions/66222727/how-to-connect-to-jmx-server-running-inside-wsl2/71881475#71881475
>
> I've launched an infinite curl loop to create some query load, but
> what now? What should I be looking for in VisualVM?
>
> On Thu, Sep 29, 2022 at 11:33 AM Eugen Stan  wrote:
> >
> > For debugging, you need to do the following:
> >
> > * pass JVM options to enable debugging
> > * expose docker port for JVM debug you chose
> >
> > https://stackoverflow.com/questions/138511/what-are-java-command-line-options-to-set-to-allow-jvm-to-be-remotely-debugged
> >
> > You should be able to do all this without changing the image: docker env
> > variables and docker port option.
> >
> > Once container is started and port is listening, open (confirm with
> > docker ps) connect to it to debug.
> >
> > Good luck,
> >
> > On 29.09.2022 11:22, Martynas Jusevičius wrote:
> > > On Thu, Sep 29, 2022 at 9:41 AM Lorenz Buehmann
> > >  wrote:
> > >>
> > >> You're working on an in-memory dataset?
> > >
> > > No the datasets are TDB2-backed
> > >
> > >> Does it also happen with Jena 4.6.1?
> > >
> > > Don't know :)
> > >
> > > I wanted to run a profiler and tried connecting from VisualVM on
> > > Windows to the Fuseki container but neither jstatd nor JMX connections
> > > worked...
> > > Now I want to run VisualVM inside the container itself but this
> > > requires changing the Docker image in a way that I haven't figured out
> > > yet.
> > >
> > >>
> > >> On 28.09.22 20:23, Martynas Jusevičius wrote:
> > >>> Hi,
> > >>>
> > >>> We have a dockerized Fuseki 4.5.0 instance that is gradually running
> > >>> out of memory over the course of a few hours.
> > >>>
> > >>> 3 datasets, none larger than 10 triples. The load is negligible
> > >>> (maybe a few bursts x 10 simple queries per minute), no updates.
> > >>>
> > >>> Dockerfile: 
> > >>> https://github.com/AtomGraph/fuseki-docker/blob/master/Dockerfile
> > >>> Memory settings:
> > >>> mem_limit: 10240m
> > >>> JAVA_OPTIONS=-Xmx7700m -Xms7700m
> > >>>
> > >>> Any advice?
> > >>>
> > >>> Martynas
> >
> > --
> > Eugen Stan
> >
> > +40770 941 271  / https://www.netdava.com


Re: Fuseki container is OOMKilled

2022-09-29 Thread Martynas Jusevičius
Hi Eugen,

I have the debugger working, I was trying to connect the profiler :)
Finally I managed to connect from VisualVM on Windows thanks to this
answer: 
https://stackoverflow.com/questions/66222727/how-to-connect-to-jmx-server-running-inside-wsl2/71881475#71881475

I've launched an infinite curl loop to create some query load, but
what now? What should I be looking for in VisualVM?

On Thu, Sep 29, 2022 at 11:33 AM Eugen Stan  wrote:
>
> For debugging, you need to do the following:
>
> * pass JVM options to enable debugging
> * expose docker port for JVM debug you chose
>
> https://stackoverflow.com/questions/138511/what-are-java-command-line-options-to-set-to-allow-jvm-to-be-remotely-debugged
>
> You should be able to do all this without changing the image: docker env
> variables and docker port option.
>
> Once container is started and port is listening, open (confirm with
> docker ps) connect to it to debug.
>
> Good luck,
>
> On 29.09.2022 11:22, Martynas Jusevičius wrote:
> > On Thu, Sep 29, 2022 at 9:41 AM Lorenz Buehmann
> >  wrote:
> >>
> >> You're working on an in-memory dataset?
> >
> > No the datasets are TDB2-backed
> >
> >> Does it also happen with Jena 4.6.1?
> >
> > Don't know :)
> >
> > I wanted to run a profiler and tried connecting from VisualVM on
> > Windows to the Fuseki container but neither jstatd nor JMX connections
> > worked...
> > Now I want to run VisualVM inside the container itself but this
> > requires changing the Docker image in a way that I haven't figured out
> > yet.
> >
> >>
> >> On 28.09.22 20:23, Martynas Jusevičius wrote:
> >>> Hi,
> >>>
> >>> We have a dockerized Fuseki 4.5.0 instance that is gradually running
> >>> out of memory over the course of a few hours.
> >>>
> >>> 3 datasets, none larger than 10 triples. The load is negligible
> >>> (maybe a few bursts x 10 simple queries per minute), no updates.
> >>>
> >>> Dockerfile: 
> >>> https://github.com/AtomGraph/fuseki-docker/blob/master/Dockerfile
> >>> Memory settings:
> >>> mem_limit: 10240m
> >>> JAVA_OPTIONS=-Xmx7700m -Xms7700m
> >>>
> >>> Any advice?
> >>>
> >>> Martynas
>
> --
> Eugen Stan
>
> +40770 941 271  / https://www.netdava.com


Re: Fuseki container is OOMKilled

2022-09-29 Thread Martynas Jusevičius
On Thu, Sep 29, 2022 at 9:41 AM Lorenz Buehmann
 wrote:
>
> You're working on an in-memory dataset?

No the datasets are TDB2-backed

> Does it also happen with Jena 4.6.1?

Don't know :)

I wanted to run a profiler and tried connecting from VisualVM on
Windows to the Fuseki container but neither jstatd nor JMX connections
worked...
Now I want to run VisualVM inside the container itself but this
requires changing the Docker image in a way that I haven't figured out
yet.

>
> On 28.09.22 20:23, Martynas Jusevičius wrote:
> > Hi,
> >
> > We have a dockerized Fuseki 4.5.0 instance that is gradually running
> > out of memory over the course of a few hours.
> >
> > 3 datasets, none larger than 10 triples. The load is negligible
> > (maybe a few bursts x 10 simple queries per minute), no updates.
> >
> > Dockerfile: 
> > https://github.com/AtomGraph/fuseki-docker/blob/master/Dockerfile
> > Memory settings:
> > mem_limit: 10240m
> > JAVA_OPTIONS=-Xmx7700m -Xms7700m
> >
> > Any advice?
> >
> > Martynas


Fuseki container is OOMKilled

2022-09-28 Thread Martynas Jusevičius
Hi,

We have a dockerized Fuseki 4.5.0 instance that is gradually running
out of memory over the course of a few hours.

3 datasets, none larger than 10 triples. The load is negligible
(maybe a few bursts x 10 simple queries per minute), no updates.

Dockerfile: https://github.com/AtomGraph/fuseki-docker/blob/master/Dockerfile
Memory settings:
mem_limit: 10240m
JAVA_OPTIONS=-Xmx7700m -Xms7700m

Any advice?

Martynas


Re: Re: TDB2 bulk loader - multiple files into different graph per file

2022-08-29 Thread Martynas Jusevičius
On Sun, Aug 28, 2022 at 11:00 AM Lorenz Buehmann
 wrote:
>
> Hi Andy,
>
> thanks for the fast response.
>
> I see - the only drawback with wrapping the streams into TriG is when we
> have Turtle syntax files (or lets say any non N-Triples format) - afaik,
> prefixes aren't allowed inside graphs, i.e. at that point you're lost.
> What I did now is to pipe those files into riot first which then
> generates N-Triples which then can be wrapped in TriG graphs. Indeed, we
> have the riot overhead here, i.e. the data is parsed twice. Still faster
> though then loading graphs in separate TDB loader calls, so I guess I
> can live with this.

I had a similar question a few years ago, and Claus responded:
https://stackoverflow.com/questions/63467067/converting-rdf-triples-to-quads-from-command-line/63716278
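
The same triples-to-quads rewrite can also be done in-process as a
streaming filter (a sketch; the graph IRI and file name are made up):

import org.apache.jena.graph.Node;
import org.apache.jena.graph.NodeFactory;
import org.apache.jena.graph.Triple;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.system.StreamRDF;
import org.apache.jena.riot.system.StreamRDFLib;
import org.apache.jena.riot.system.StreamRDFWrapper;
import org.apache.jena.sparql.core.Quad;

Node graph = NodeFactory.createURI("http://example.org/graph1");
StreamRDF out = StreamRDFLib.writer(System.out); // emits N-Triples/N-Quads
StreamRDF addGraph = new StreamRDFWrapper(out) {
    // rewrite every incoming triple as a quad in the target graph
    @Override public void triple(Triple t) { quad(new Quad(graph, t)); }
};
RDFParser.source("file1.ttl").lang(Lang.TTL).parse(addGraph);

The data is parsed once and the quads can be piped straight into the loader.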

>
> Having a follow-up question:
>
> I could see a huge difference between reading a compressed (Bzip) vs an
> uncompressed file:
>
> I only include the output up to the point where the triples have been
> loaded, as the index creation should not be affected by the compression:
>
>
> # uncompressed with tdb2.tdbloader
>
> 14:24:40 INFO  loader  :: Add: 163,000,000
> river_planet-latest.osm.pbf.ttl (Batch: 144,320 / Avg: 140,230)
> 14:24:42 INFO  loader  :: Finished:
> output/river_planet-latest.osm.pbf.ttl: 163,310,838 tuples in 1165.30s
> (Avg: 140,145)
>
>
> # compressed with tdb2.tdbloader
>
> 17:37:37 INFO  loader  :: Add: 163,000,000
> river_planet-latest.osm.pbf.ttl.bz2 (Batch: 19,424 / Avg: 16,050)
> 17:37:40 INFO  loader  :: Finished:
> output/river_planet-latest.osm.pbf.ttl.bz2: 163,310,838 tuples in
> 10158.16s (Avg: 16,076)
>
>
> So loading the compressed file is ~9x slower than the uncompressed one.
> Can we consider this as expected? Note, here we have a geospatial
> dataset with millions of geometry literals. Not sure if this is also
> something that makes things worse.
>
> What are your experiences with loading compressed vs uncompressed data?
>
>
> Cheers,
>
> Lorenz
>
>
> On 26.08.22 17:02, Andy Seaborne wrote:
> > Hi Lorenz,
> >
> > No - there isn't an option.
> >
> > The way to do it is to prepare the load as quads by, for example,
> > wrapping in TriG syntax around the files or adding the G to N-triples.
> >
> > This can be done streaming and piped into the loader (with --syntax=
> > if not N-quads).
> >
> > > By the way, the tdb2.xloader has no option for named graphs at all?
> >
> > The input needs to be prepared as quads.
> >
> > Andy
> >
> > On 26/08/2022 15:03, Lorenz Buehmann wrote:
> >> Hi all,
> >>
> >> is there any option to use TDB2 bulk loader (tdb2.xloader or just
> >> tdb2.loader) to load multiple files into multiple different named
> >> graphs? Like
> >>
> >> tdb2.loader --loc ./tdb2/dataset --graph  file1 --graph 
> >> file2 ...
> >>
> >> I'm asking because I thought the initial loading is way faster than
> >> iterating over multiple (graph, file) pairs and running the TDB2
> >> loader for each pair?
> >>
> >>
> >> By the way, the tdb2.xloader has no option for named graphs at all?
> >>
> >>
> >> Cheers,
> >>
> >> Lorenz
> >>


Re: Cannot invoke "String.equals(Object)" because "prefix" is null

2022-07-13 Thread Martynas Jusevičius
On a further look I can see that Relation::backwards returns null
because "http://www.w3.org/2001/XMLSchema#; is not in the cols map.
https://github.com/apache/jena/blob/main/jena-core/src/main/java/org/apache/jena/rdfxml/xmloutput/impl/Relation.java#L204

That's where my investigation ends, I'm afraid...
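
A workaround sketch that should sidestep the failed prefix lookup
(untested; model is the model being written) is to declare the prefix
up front:

import org.apache.jena.vocabulary.XSD;

model.setNsPrefix("xsd", XSD.getURI()); // "http://www.w3.org/2001/XMLSchema#"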

On Wed, Jul 13, 2022 at 5:51 PM Martynas Jusevičius
 wrote:
>
> I can see that this.getPrefixFor( value ) returns null, where value is
> "http://www.w3.org/2001/XMLSchema#;.
> https://github.com/apache/jena/blob/main/jena-core/src/main/java/org/apache/jena/rdfxml/xmloutput/impl/BaseXMLWriter.java#L229
>
> No idea why though.
>
> On Wed, Jul 13, 2022 at 5:03 PM Martynas Jusevičius
>  wrote:
> >
> > Hi,
> >
> > Never seen this exception before and not sure yet what causes it:
> >
> > java.lang.NullPointerException: Cannot invoke "String.equals(Object)"
> > because "prefix" is null
> > org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.checkLegalPrefix(BaseXMLWriter.java:845)
> > org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.setNsPrefix(BaseXMLWriter.java:324)
> > org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.primeNamespace(BaseXMLWriter.java:232)
> > org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.setupNamespaces(BaseXMLWriter.java:482)
> > org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.write(BaseXMLWriter.java:466)
> > org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.write(BaseXMLWriter.java:456)
> > com.atomgraph.linkeddatahub.server.io.ValidatingModelProvider.write(ValidatingModelProvider.java:129)
> >
> > Any hints?


Re: Cannot invoke "String.equals(Object)" because "prefix" is null

2022-07-13 Thread Martynas Jusevičius
I can see that this.getPrefixFor( value ) returns null, where value is
"http://www.w3.org/2001/XMLSchema#;.
https://github.com/apache/jena/blob/main/jena-core/src/main/java/org/apache/jena/rdfxml/xmloutput/impl/BaseXMLWriter.java#L229

No idea why though.

On Wed, Jul 13, 2022 at 5:03 PM Martynas Jusevičius
 wrote:
>
> Hi,
>
> Never seen this exception before and not sure yet what causes it:
>
> java.lang.NullPointerException: Cannot invoke "String.equals(Object)"
> because "prefix" is null
> org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.checkLegalPrefix(BaseXMLWriter.java:845)
> org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.setNsPrefix(BaseXMLWriter.java:324)
> org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.primeNamespace(BaseXMLWriter.java:232)
> org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.setupNamespaces(BaseXMLWriter.java:482)
> org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.write(BaseXMLWriter.java:466)
> org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.write(BaseXMLWriter.java:456)
> com.atomgraph.linkeddatahub.server.io.ValidatingModelProvider.write(ValidatingModelProvider.java:129)
>
> Any hints?


Cannot invoke "String.equals(Object)" because "prefix" is null

2022-07-13 Thread Martynas Jusevičius
Hi,

Never seen this exception before and not sure yet what causes it:

java.lang.NullPointerException: Cannot invoke "String.equals(Object)"
because "prefix" is null
org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.checkLegalPrefix(BaseXMLWriter.java:845)
org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.setNsPrefix(BaseXMLWriter.java:324)
org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.primeNamespace(BaseXMLWriter.java:232)
org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.setupNamespaces(BaseXMLWriter.java:482)
org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.write(BaseXMLWriter.java:466)
org.apache.jena.rdfxml.xmloutput.impl.BaseXMLWriter.write(BaseXMLWriter.java:456)
com.atomgraph.linkeddatahub.server.io.ValidatingModelProvider.write(ValidatingModelProvider.java:129)

Any hints?


Re: Fuseki https certificate problems

2022-07-07 Thread Martynas Jusevičius
Can't you just provide a keystore password?

https://stackoverflow.com/questions/12862655/using-an-empty-keystore-password-used-to-be-possible

On Thu, Jul 7, 2022 at 11:42 AM Andy Seaborne  wrote:
>
> Hi Nikolaos,
>
>
> On 06/07/2022 11:04, Nikolaos Beredimas wrote:
> > While trying to get Fuseki running over https I found this thread from
> > February
> > https://jena.markmail.org/message/2kqpd2tlinpdzpna?q=ssl+order:date-backward=1
> >
> > 1. I can confirm the provided xml works (tested on Fuseki 4.5.0)
>
> Thanks for confirming that.
>
> >
> > 2. I am having some issues generating the needed pkcs12 certificate file.
> >
> > a. When trying to generate a password-less pkcs12 file (openssl ...
> > -passout pass:) Fuseki doesn't complain when loading it, but I always get
> > SSL handshake errors and it doesn't work.
>
> It is Jetty that is handling the certificate via the JDK.
>
> Mentions like
>
> https://stackoverflow.com/questions/58345405/how-to-use-non-password-protected-p12-ssl-certificate-in-spring-boot
>
> (which is nearly 3 years old)
>
> suggest a password was needed at some time in the past. Current jetty
> documentation does not mention it one way of the other.
>
> > b. When trying to generate with a password I get mixed results:
> > OpenSSL 1.1.1f  31 Mar 2020 running on WSL2 Ubuntu 20.04 works fine. Fuseki
> > loads the certificate and works like a charm.
> > However, if I use OpenSSL 1.1.1o  3 May 2022 (running on
> > docker-linuxserver/docker-swag:latest) I get a strange exception stacktrace:
> >
> > java.io.IOException: keystore password was incorrect
> > at sun.security.pkcs12.PKCS12KeyStore.engineLoad(Unknown Source) ~[?:?]
> > at sun.security.util.KeyStoreDelegator.engineLoad(Unknown Source) ~[?:?]
> > at java.security.KeyStore.load(Unknown Source) ~[?:?]
> > at
> > org.eclipse.jetty.util.security.CertificateUtils.getKeyStore(CertificateUtils.java:49)
> > ~[fuseki-server.jar:4.5.0]
> > ...
> > Caused by: java.security.UnrecoverableKeyException: failed to decrypt safe
> > contents entry: javax.crypto.BadPaddingException: Given final block not
> > properly padded. Such issues can arise if a bad key is used during
> > decryption.
> > ... 28 more
>
> I'm afraid I don't know what that indicates.
>
> >
> >
> > I would appreciate any input to pinpoint and solve any or both issues above.
>
> We'd be interested in hearing what you find out.
>
> >
> > Regards,
> > Nikolaos Beredimas
> >


Re: Restricting SPARQL update to a single named graph

2022-06-13 Thread Martynas Jusevičius
On Fri, Jun 10, 2022 at 5:13 PM Martynas Jusevičius
 wrote:
>
> On Tue, Jun 7, 2022 at 9:15 PM Andy Seaborne  wrote:
> >
> > On 07/06/2022 10:47, Martynas Jusevičius wrote:
> > > Hi,
> > >
> > > I have implemented PATCH method for the Graph Store Protocol:
> > > https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch
> > >
> > > The PATCH is applied to a named graph. I am missing this bit however:
> > > " If a SPARQL 1.1 Update request is used as the RDF payload for a
> > > PATCH request that makes changes to more than one graph or the graph
> > > it modifies is not the one indicated, it would be prudent for the
> > > server to respond with a 422 Unprocessable Entity status."
> >
> > I read that in the context of GSP resource naming.
> >
> > ?graph=
> >
> > and so the update does not name a graph - it'll look like the default
> > graph in the update.
> >
> > So look for GRAPH in the update.
> >
> > > What would be the way to make sure that an update only affects a
> > > single specific graph?
> >
> > A dataset of one graph and no others. c.f. DatasetGraphOne but for a
> > single named graph and read-only dft graph.
> >
> > Or a dataset which yields read-only graphs except for the target graph.
> >
> > Or analyse the update - no GRAPH in templates if the target comes from
> > the URL.
>
> It seems that it's not so easy to check for GRAPH in the update after all...
>
> What is the way to "analyse the update - no GRAPH in templates" that
> you speak of? I need to check both DELETE and INSERT templates.
>
> I thought I had found a way:
>
>   updateRequest.getOperations().get(0).getDeleteAcc().getGraph()
>
> but it returns <urn:x-arq:DefaultGraphNode> for the following update,
> which probably means it doesn't do what I think it does:
>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> WITH <https://localhost:4443/>
> INSERT {
>   GRAPH ?g {
> <https://localhost:4443/> rdf:_2 <https://localhost:4443/#whateverest> .
>   }
> }
> WHERE
>   { GRAPH ?g
>   { ?s  ?p  ?o }
>   }

I think I figured it out. In case anyone needs it:

import org.apache.jena.sparql.core.Quad;
import org.apache.jena.sparql.modify.request.UpdateModify;
import org.apache.jena.sparql.modify.request.UpdateVisitorBase;

public class PatchUpdateVisitor extends UpdateVisitorBase
{

private boolean containsNamedGraph = false;

@Override
public void visit(UpdateModify update)
{
update.getDeleteAcc().getQuads().forEach(quad ->
{
if (!quad.getGraph().equals(Quad.defaultGraphNodeGenerated))
containsNamedGraph = true;
});
update.getInsertAcc().getQuads().forEach(quad ->
{
if (!quad.getGraph().equals(Quad.defaultGraphNodeGenerated))
containsNamedGraph = true;
});
}

public boolean containsNamedGraph()
{
return containsNamedGraph;
}

}
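
Usage then looks like this (a sketch; note the visitor only handles
UpdateModify operations, the 422 response is application-specific, and
JAX-RS is assumed):

import javax.ws.rs.WebApplicationException;

PatchUpdateVisitor visitor = new PatchUpdateVisitor();
updateRequest.getOperations().forEach(op -> op.visit(visitor));
if (visitor.containsNamedGraph())
    throw new WebApplicationException(422); // 422 Unprocessable Entity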

>
> >
> > >
> > >
> > > Martynas
> > > atomgraph.com


Re: Restricting SPARQL update to a single named graph

2022-06-10 Thread Martynas Jusevičius
On Tue, Jun 7, 2022 at 9:15 PM Andy Seaborne  wrote:
>
> On 07/06/2022 10:47, Martynas Jusevičius wrote:
> > Hi,
> >
> > I have implemented PATCH method for the Graph Store Protocol:
> > https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch
> >
> > The PATCH is applied to a named graph. I am missing this bit however:
> > " If a SPARQL 1.1 Update request is used as the RDF payload for a
> > PATCH request that makes changes to more than one graph or the graph
> > it modifies is not the one indicated, it would be prudent for the
> > server to respond with a 422 Unprocessable Entity status."
>
> I read that in the context of GSP resource naming.
>
> ?graph=
>
> and so the update does not name a graph - it'll look like the default
> graph in the update.
>
> So look for GRAPH in the update.
>
> > What would be the way to make sure that an update only affects a
> > single specific graph?
>
> A dataset of one graph and no others. c.f. DatasetGraphOne but for a
> single named graph and read-only dft graph.
>
> Or a dataset which yields read-only graphs except for the target graph.
>
> Or analyse the update - no GRAPH in templates if the target comes from
> the URL.

It seems that it's not so easy to check for GRAPH in the update after all...

What is the way to "analyse the update - no GRAPH in templates" that
you speak of? I need to check both DELETE and INSERT templates.

I thought I had found a way:

  updateRequest.getOperations().get(0).getDeleteAcc().getGraph()

but it returns <urn:x-arq:DefaultGraphNode> for the following update,
which probably means it doesn't do what I think it does:

PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

WITH <https://localhost:4443/>
INSERT {
  GRAPH ?g {
<https://localhost:4443/> rdf:_2 <https://localhost:4443/#whateverest> .
  }
}
WHERE
  { GRAPH ?g
  { ?s  ?p  ?o }
  }

>
> >
> >
> > Martynas
> > atomgraph.com


Re: Restricting SPARQL update to a single named graph

2022-06-09 Thread Martynas Jusevičius
That was a naive implementation... This one based on regex is somewhat
more robust:
https://github.com/AtomGraph/LinkedDataHub/blob/develop/src/main/java/com/atomgraph/linkeddatahub/server/model/impl/GraphStoreImpl.java#L322

If anyone has examples of how to do a similar thing without string
manipulation, that would be appreciated.
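
One possible direction without string manipulation (a sketch; it assumes
UpdateModify.setWithIRI is public API and only covers DELETE/INSERT ...
WHERE operations):

import org.apache.jena.graph.NodeFactory;
import org.apache.jena.sparql.modify.request.UpdateModify;
import org.apache.jena.update.Update;
import org.apache.jena.update.UpdateRequest;

static UpdateRequest setWith(UpdateRequest request, String graphUri)
{
    for (Update op : request.getOperations())
        if (op instanceof UpdateModify)
            ((UpdateModify) op).setWithIRI(NodeFactory.createURI(graphUri));
    return request;
}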

On Thu, Jun 9, 2022 at 12:21 PM Martynas Jusevičius
 wrote:
>
> On Thu, Jun 9, 2022 at 11:03 AM Martynas Jusevičius
>  wrote:
> >
> > On Wed, Jun 8, 2022 at 12:22 PM Andy Seaborne  wrote:
> > >
> > >
> > >
> > > On 08/06/2022 09:22, Martynas Jusevičius wrote:
> > > > On Tue, Jun 7, 2022 at 9:15 PM Andy Seaborne  wrote:
> > > >>
> > > >> On 07/06/2022 10:47, Martynas Jusevičius wrote:
> > > >>> Hi,
> > > >>>
> > > >>> I have implemented PATCH method for the Graph Store Protocol:
> > > >>> https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch
> > > >>>
> > > >>> The PATCH is applied to a named graph. I am missing this bit however:
> > > >>> " If a SPARQL 1.1 Update request is used as the RDF payload for a
> > > >>> PATCH request that makes changes to more than one graph or the graph
> > > >>> it modifies is not the one indicated, it would be prudent for the
> > > >>> server to respond with a 422 Unprocessable Entity status."
> > > >>
> > > >> I read that in the context of GSP resource naming.
> > > >>
> > > >> ?graph=
> > > >>
> > > >> and so the update does not name a graph - it'll look like the default
> > > >> graph in the update.
> > > >>
> > > >> So look for GRAPH in the update.
> > > >
> > > > Thanks, that makes sense. GRAPH is also easy to check.
> > > >
> > > > But then I need to forward the update to a triplestore that does not
> > > > support PATCH.
> > >
> > > It's the GSP naming that matters.
> > >
> > > > Which means I would need to wrap INSERT/DELETE/WHERE templates into
> > > > GRAPH  { }.
> > >
> > > WITH  DELETE {} INSERT {} WHERE {}
> > >
> > > Also: USING. And protocol.
> >
> > Thanks, forgot about those. I could definitely use WITH.
> >
> > And re. SPARQL protocol, would ?using-named-graph-uri=uri have the
> > same effect as WITH  in this case?
>
> I realized ?using-named-graph-uri=uri would restrict the dataset but
> still require GRAPH  in the update string.
>
> WITH  on the other hand does what I need. I add it by string
> manipulation -- the current code is ugly but seems to do the job:
>
> String updateString = updateRequest.toString();
> // append WITH  before DELETE or INSERT
> if (updateString.toUpperCase().contains("DELETE"))
>updateString = updateString.replaceAll("(?i)" +
> Pattern.quote("DELETE"), "WITH <" + graphUri + ">\nDELETE");
> else
> {
> if (updateString.toUpperCase().contains("INSERT"))
> updateString = updateString.replaceAll("(?i)" +
> Pattern.quote("INSERT"), "WITH <" + graphUri + ">\nINSERT");
> else throw new BadRequestException("SPARQL update contains
> no DELETE or INSERT?"); // cannot happen
> }
> updateRequest = UpdateFactory.create(updateString);
>
> >
> > >
> > > > Is there some builder code that can help with that?
> > >
> > > Have you looked at UpdateBuilder?
> > >
> >
> > I looked at the Javadoc, it seems quite complicated. I'll see if I can
> > avoid modifying the update string.
> >
> >
> > > >
> > > >>
> > > >>> What would be the way to make sure that an update only affects a
> > > >>> single specific graph?
> > > >>
> > > >> A dataset of one graph and no others. c.f. DatasetGraphOne but for a
> > > >> single named graph and read-only dft graph.
> > > >>
> > > >> Or a dataset which yields read-only graphs except for the target graph.
> > > >>
> > > >> Or analyse the update - no GRAPH in templates if the target comes from
> > > >> the URL.
> > > >>
> > > >>>
> > > >>>
> > > >>> Martynas
> > > >>> atomgraph.com


Re: Restricting SPARQL update to a single named graph

2022-06-09 Thread Martynas Jusevičius
On Thu, Jun 9, 2022 at 11:03 AM Martynas Jusevičius
 wrote:
>
> On Wed, Jun 8, 2022 at 12:22 PM Andy Seaborne  wrote:
> >
> >
> >
> > On 08/06/2022 09:22, Martynas Jusevičius wrote:
> > > On Tue, Jun 7, 2022 at 9:15 PM Andy Seaborne  wrote:
> > >>
> > >> On 07/06/2022 10:47, Martynas Jusevičius wrote:
> > >>> Hi,
> > >>>
> > >>> I have implemented PATCH method for the Graph Store Protocol:
> > >>> https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch
> > >>>
> > >>> The PATCH is applied to a named graph. I am missing this bit however:
> > >>> " If a SPARQL 1.1 Update request is used as the RDF payload for a
> > >>> PATCH request that makes changes to more than one graph or the graph
> > >>> it modifies is not the one indicated, it would be prudent for the
> > >>> server to respond with a 422 Unprocessable Entity status."
> > >>
> > >> I read that in the context of GSP resource naming.
> > >>
> > >> ?graph=
> > >>
> > >> and so the update does not name a graph - it'll look like the default
> > >> graph in the update.
> > >>
> > >> So look for GRAPH in the update.
> > >
> > > Thanks, that makes sense. GRAPH is also easy to check.
> > >
> > > But then I need to forward the update to a triplestore that does not
> > > support PATCH.
> >
> > It's the GSP naming that matters.
> >
> > > Which means I would need to wrap INSERT/DELETE/WHERE templates into
> > > GRAPH <uri> { }.
> >
> > WITH <uri> DELETE {} INSERT {} WHERE {}
> >
> > Also: USING. And protocol.
>
> Thanks, forgot about those. I could definitely use WITH.
>
> And re. SPARQL protocol, would ?using-named-graph-uri=uri have the
> same effect as WITH <uri> in this case?

I realized ?using-named-graph-uri=uri would restrict the dataset but
still require GRAPH <uri> in the update string.

WITH <uri> on the other hand does what I need. I add it by string
manipulation -- the current code is ugly but seems to do the job:

String updateString = updateRequest.toString();
// prepend WITH <graphUri> to the first DELETE or INSERT keyword
if (updateString.toUpperCase().contains("DELETE"))
    updateString = updateString.replaceFirst("(?i)" +
        Pattern.quote("DELETE"), "WITH <" + graphUri + ">\nDELETE");
else
{
    if (updateString.toUpperCase().contains("INSERT"))
        updateString = updateString.replaceFirst("(?i)" +
            Pattern.quote("INSERT"), "WITH <" + graphUri + ">\nINSERT");
    else throw new BadRequestException("SPARQL update contains no DELETE or INSERT?"); // cannot happen
}
updateRequest = UpdateFactory.create(updateString);

>
> >
> > > Is there some builder code that can help with that?
> >
> > Have you looked at UpdateBuilder?
> >
>
> I looked at the Javadoc, it seems quite complicated. I'll see if I can
> avoid modifying the update string.
>
>
> > >
> > >>
> > >>> What would be the way to make sure that an update only affects a
> > >>> single specific graph?
> > >>
> > >> A dataset of one graph and no others. c.f. DatasetGraphOne but for a
> > >> single named graph and read-only dft graph.
> > >>
> > >> Or a dataset which yields read-only graphs except for the target graph.
> > >>
> > >> Or analyse the update - no GRAPH in templates if the target comes from
> > >> the URL.
> > >>
> > >>>
> > >>>
> > >>> Martynas
> > >>> atomgraph.com
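
An alternative sketch that sets WITH on the parsed update instead of doing
string surgery. It assumes UpdateWithUsing#setWithIRI(Node) -- the base
class of parsed DELETE/INSERT operations, which lives in an internal Jena
package, so check it against your Jena version; the class name
WithGraphRewriter is just illustrative:

import org.apache.jena.graph.NodeFactory;
import org.apache.jena.sparql.modify.request.UpdateWithUsing;
import org.apache.jena.update.Update;
import org.apache.jena.update.UpdateRequest;

public class WithGraphRewriter
{
    // adds WITH <graphUri> to every DELETE/INSERT operation that has no explicit WITH
    public static UpdateRequest withGraph(UpdateRequest request, String graphUri)
    {
        for (Update update : request.getOperations())
        {
            if (update instanceof UpdateWithUsing)
            {
                UpdateWithUsing op = (UpdateWithUsing)update;
                if (op.getWithIRI() == null) op.setWithIRI(NodeFactory.createURI(graphUri));
            }
        }
        return request;
    }
}

This sidesteps the keyword matching entirely and leaves any WITH that was
already present in the request untouched.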


Re: Restricting SPARQL update to a single named graph

2022-06-09 Thread Martynas Jusevičius
On Wed, Jun 8, 2022 at 12:22 PM Andy Seaborne  wrote:
>
>
>
> On 08/06/2022 09:22, Martynas Jusevičius wrote:
> > On Tue, Jun 7, 2022 at 9:15 PM Andy Seaborne  wrote:
> >>
> >> On 07/06/2022 10:47, Martynas Jusevičius wrote:
> >>> Hi,
> >>>
> >>> I have implemented PATCH method for the Graph Store Protocol:
> >>> https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch
> >>>
> >>> The PATCH is applied to a named graph. I am missing this bit however:
> >>> " If a SPARQL 1.1 Update request is used as the RDF payload for a
> >>> PATCH request that makes changes to more than one graph or the graph
> >>> it modifies is not the one indicated, it would be prudent for the
> >>> server to respond with a 422 Unprocessable Entity status."
> >>
> >> I read that in the context of GSP resource naming.
> >>
> >> ?graph=
> >>
> >> and so the update does not name a graph - it'll look like the default
> >> graph in the update.
> >>
> >> So look for GRAPH in the update.
> >
> > Thanks, that makes sense. GRAPH is also easy to check.
> >
> > But then I need to forward the update to a triplestore that does not
> > support PATCH.
>
> It's the GSP naming that matters.
>
> > Which means I would need to wrap INSERT/DELETE/WHERE templates into
> > GRAPH <uri> { }.
>
> WITH <uri> DELETE {} INSERT {} WHERE {}
>
> Also: USING. And protocol.

Thanks, forgot about those. I could definitely use WITH.

And re. SPARQL protocol, would ?using-named-graph-uri=uri have the
same effect as WITH <uri> in this case?

>
> > Is there some builder code that can help with that?
>
> Have you looked at UpdateBuilder?
>

I looked at the Javadoc, it seems quite complicated. I'll see if I can
avoid modifying the update string.


> >
> >>
> >>> What would be the way to make sure that an update only affects a
> >>> single specific graph?
> >>
> >> A dataset of one graph and no others. c.f. DatasetGraphOne but for a
> >> single named graph and read-only dft graph.
> >>
> >> Or a dataset which yields read-only graphs except for the target graph.
> >>
> >> Or analyse the update - no GRAPH in templates if the target comes from
> >> the URL.
> >>
> >>>
> >>>
> >>> Martynas
> >>> atomgraph.com


Re: Restricting SPARQL update to a single named graph

2022-06-08 Thread Martynas Jusevičius
On Tue, Jun 7, 2022 at 9:15 PM Andy Seaborne  wrote:
>
> On 07/06/2022 10:47, Martynas Jusevičius wrote:
> > Hi,
> >
> > I have implemented PATCH method for the Graph Store Protocol:
> > https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch
> >
> > The PATCH is applied to a named graph. I am missing this bit however:
> > " If a SPARQL 1.1 Update request is used as the RDF payload for a
> > PATCH request that makes changes to more than one graph or the graph
> > it modifies is not the one indicated, it would be prudent for the
> > server to respond with a 422 Unprocessable Entity status."
>
> I read that in the context of GSP resource naming.
>
> ?graph=
>
> and so the update does not name a graph - it'll look like the default
> graph in the update.
>
> So look for GRAPH in the update.

Thanks, that makes sense. GRAPH is also easy to check.

But then I need to forward the update to a triplestore that does not
support PATCH.
Which means I would need to wrap INSERT/DELETE/WHERE templates into
GRAPH <uri> { }.
Is there some builder code that can help with that?

>
> > What would be the way to make sure that an update only affects a
> > single specific graph?
>
> A dataset of one graph and no others. c.f. DatasetGraphOne but for a
> single named graph and read-only dft graph.
>
> Or a dataset which yields read-only graphs except for the target graph.
>
> Or analyse the update - no GRAPH in templates if the target comes from
> the URL.
>
> >
> >
> > Martynas
> > atomgraph.com


Restricting SPARQL update to a single named graph

2022-06-07 Thread Martynas Jusevičius
Hi,

I have implemented PATCH method for the Graph Store Protocol:
https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch

The PATCH is applied to a named graph. I am missing this bit however:
" If a SPARQL 1.1 Update request is used as the RDF payload for a
PATCH request that makes changes to more than one graph or the graph
it modifies is not the one indicated, it would be prudent for the
server to respond with a 422 Unprocessable Entity status."

What would be the way to make sure that an update only affects a
single specific graph?


Martynas
atomgraph.com


Re: SHACLC and RDFLanguages

2022-05-23 Thread Martynas Jusevičius
Could RDFParserRegistry::getRegistered and
ResultSetReaderRegistry::getRegistered be added?

On Fri, May 20, 2022 at 9:01 PM Andy Seaborne  wrote:
>
>
>
> On 20/05/2022 14:05, Martynas Jusevičius wrote:
> > Andy, is that correct?
>
> Yes
>
>  Andy
>
> >
> > On Tue, May 17, 2022 at 1:33 PM Martynas Jusevičius
> >  wrote:
> >>
> >> On Tue, May 17, 2022 at 1:19 PM Andy Seaborne  wrote:
> >>>
> >>> RDFLanguages is a general registry of names (Lang's) in the system.
> >>>
> >>> It is not for functionality.
> >>>
> >>> RDFParserRegistry
> >>> RDFWriterRegistry
> >>> RowSetReaderRegistry, ResultSetReaderRegistry
> >>> RowSetWriterRegistry, ResultSetWriterRegistry
> >>> StreamRDFWriter
> >>>
> >>> A Lang needs looking up in a registry to see if there is support for it.
> >>
> >> Thanks, I didn't know these existed.
> >>
> >> But there are no RDFParserRegistry::getRegistered or
> >> ResultSetReaderRegistry::getRegistered methods?
> >>
> >> So do I still need to iterate RDFLanguages::getRegistered and check
> >> each Lang against
> >> RDFParserRegistry::isRegistered/ResultSetReaderRegistry::isRegistered?
> >>
> >>>
> >>>   Andy
> >>>
> >>> On 17/05/2022 09:54, Martynas Jusevičius wrote:
> >>>> Hi,
> >>>>
> >>>> After upgrading from 4.3.2 to 4.5.0, some of our RDF writing code
> >>>> started failing.
> >>>>
> >>>> It seems that this is due to RDFLanguages.isTriples(Lang.SHACLC)
> >>>> returning true, which messes up our content negotiation as it attempts
> >>>> to write Models as SHACLC. Can this be rectified?
> >>>>
> >>>> The RDFLanguages registry is a bit of an oxymoron in general. Right
> >>>> now it's a bag of all sorts of syntaxes Jena supports, half of which
> >>>> are not even "RDF languages". We need to iterate and filter the
> >>>> languages just to know which ones can be used to read/write Models,
> >>>> which can be used for ResultSets etc.:
> >>>> https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/MediaTypes.java#L86
> >>>> Wouldn't it make sense to have separate registries depending on the
> >>>> entity types they apply to?
> >>>>
> >>>> Thanks.
> >>>>
> >>>> Martynas


Re: SHACLC and RDFLanguages

2022-05-20 Thread Martynas Jusevičius
Andy, is that correct?

On Tue, May 17, 2022 at 1:33 PM Martynas Jusevičius
 wrote:
>
> On Tue, May 17, 2022 at 1:19 PM Andy Seaborne  wrote:
> >
> > RDFLanguages is a general registry of names (Lang's) in the system.
> >
> > It is not for functionality.
> >
> > RDFParserRegistry
> > RDFWriterRegistry
> > RowSetReaderRegistry, ResultSetReaderRegistry
> > RowSetWriterRegistry, ResultSetWriterRegistry
> > StreamRDFWriter
> >
> > A Lang needs looking up in a registry to see if there is support for it.
>
> Thanks, I didn't know these existed.
>
> But there are no RDFParserRegistry::getRegistered or
> ResultSetReaderRegistry::getRegistered methods?
>
> So do I still need to iterate RDFLanguages::getRegistered and check
> each Lang against
> RDFParserRegistry::isRegistered/ResultSetReaderRegistry::isRegistered?
>
> >
> >  Andy
> >
> > On 17/05/2022 09:54, Martynas Jusevičius wrote:
> > > Hi,
> > >
> > > After upgrading from 4.3.2 to 4.5.0, some of our RDF writing code
> > > started failing.
> > >
> > > It seems that this is due to RDFLanguages.isTriples(Lang.SHACLC)
> > > returning true, which messes up our content negotiation as it attempts
> > > to write Models as SHACLC. Can this be rectified?
> > >
> > > The RDFLanguages registry is a bit of an oxymoron in general. Right
> > > now it's a bag of all sorts of syntaxes Jena supports, half of which
> > > are not even "RDF languages". We need to iterate and filter the
> > > languages just to know which ones can be used to read/write Models,
> > > which can be used for ResultSets etc.:
> > > https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/MediaTypes.java#L86
> > > Wouldn't it make sense to have separate registries depending on the
> > > entity types they apply to?
> > >
> > > Thanks.
> > >
> > > Martynas


Re: xsd:DateTime not working

2022-05-19 Thread Martynas Jusevičius
Works for me:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
select * where {
  bind (xsd:dateTime("2014-06-05T10:10:10+05:00") as ?asDate)
}

http://sparql.org/sparql?query=PREFIX+xsd%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23%3E%0D%0Aselect+*+where+%7B%0D%0A++bind+%28xsd%3AdateTime%28%222014-06-05T10%3A10%3A10%2B05%3A00%22%29+as+%3FasDate%29%0D%0A%7D==text=%2Fxml-to-html.xsl

In your example:
- xsd:DateTime is cased wrong (the constructor function is xsd:dateTime)
- no ":" in the "+0500" timezone offset (it should be "+05:00")

On Thu, May 19, 2022 at 4:33 PM Erich Bremer  wrote:
>
> I'm trying to do a sparql update on a Model in Jena.
>
> The below query, against Virtuoso will create "2014-06-05T10:10:10"
> correctly, but I get null using Jena's SPARQL.
>
> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> select * where {graph ?g {?s ?p ?o}
>   bind (xsd:DateTime("2014-06-05T10:10:10") as ?asDate)
> }
> limit 3
>
> Even changing the source string to "2014-06-05T10:10:10+0500", will still
> yield nothing from the Jena conversion.
>
>- Erich


Re: SHACLC and RDFLanguages

2022-05-17 Thread Martynas Jusevičius
On Tue, May 17, 2022 at 1:19 PM Andy Seaborne  wrote:
>
> RDFLanguages is a general registry of names (Lang's) in the system.
>
> It is not for functionality.
>
> RDFParserRegistry
> RDFWriterRegistry
> RowSetReaderRegistry, ResultSetReaderRegistry
> RowSetWriterRegistry, ResultSetWriterRegistry
> StreamRDFWriter
>
> A Lang needs looking up in a registry to see if there is support for it.

Thanks, I didn't know these existed.

But there are no RDFParserRegistry::getRegistered or
ResultSetReaderRegistry::getRegistered methods?

So do I still need to iterate RDFLanguages::getRegistered and check
each Lang against
RDFParserRegistry::isRegistered/ResultSetReaderRegistry::isRegistered?

>
>      Andy
>
> On 17/05/2022 09:54, Martynas Jusevičius wrote:
> > Hi,
> >
> > After upgrading from 4.3.2 to 4.5.0, some of our RDF writing code
> > started failing.
> >
> > It seems that this is due to RDFLanguages.isTriples(Lang.SHACLC)
> > returning true, which messes up our content negotiation as it attempts
> > to write Models as SHACLC. Can this be rectified?
> >
> > The RDFLanguages registry is a bit of an oxymoron in general. Right
> > now it's a bag of all sorts of syntaxes Jena supports, half of which
> > are not even "RDF languages". We need to iterate and filter the
> > languages just to know which ones can be used to read/write Models,
> > which can be used for ResultSets etc.:
> > https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/MediaTypes.java#L86
> > Wouldn't it make sense to have separate registries depending on the
> > entity types they apply to?
> >
> > Thanks.
> >
> > Martynas
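
A sketch of the iterate-and-filter approach with the registries named above
(method and package names as in Jena 4.x -- verify against your version):

import java.util.List;
import java.util.stream.Collectors;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFLanguages;
import org.apache.jena.riot.RDFParserRegistry;
import org.apache.jena.riot.resultset.ResultSetReaderRegistry;

public class RegisteredLangs
{
    public static void main(String[] args)
    {
        // Langs with a registered parser, i.e. usable for reading Models/Graphs
        List<Lang> modelLangs = RDFLanguages.getRegistered().stream()
                .filter(RDFParserRegistry::isRegistered)
                .collect(Collectors.toList());

        // Langs with a registered result set reader, i.e. usable for SELECT/ASK results
        List<Lang> resultSetLangs = RDFLanguages.getRegistered().stream()
                .filter(ResultSetReaderRegistry::isRegistered)
                .collect(Collectors.toList());

        System.out.println("Model langs: " + modelLangs);
        System.out.println("Result set langs: " + resultSetLangs);
    }
}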


SHACLC and RDFLanguages

2022-05-17 Thread Martynas Jusevičius
Hi,

After upgrading from 4.3.2 to 4.5.0, some of our RDF writing code
started failing.

It seems that this is due to RDFLanguages.isTriples(Lang.SHACLC)
returning true, which messes up our content negotiation as it attempts
to write Models as SHACLC. Can this be rectified?

The RDFLanguages registry is a bit of an oxymoron in general. Right
now it's a bag of all sorts of syntaxes Jena supports, half of which
are not even "RDF languages". We need to iterate and filter the
languages just to know which ones can be used to read/write Models,
which can be used for ResultSets etc.:
https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/MediaTypes.java#L86
Wouldn't it make sense to have separate registries depending on the
entity types they apply to?

Thanks.

Martynas


Re: Apache Jena - 10 years as an Apache Project.

2022-04-19 Thread Martynas Jusevičius
Congrats :) User since 2006 I think.

On Tue, Apr 19, 2022 at 9:50 PM  wrote:
>
> It's especially impressive because we are thriving with new contributions.
> Go us!
>
> Adam
>
> On Mon, Apr 18, 2022, 3:51 PM Dan Brickley  wrote:
>
> > On Mon, 18 Apr 2022 at 17:39, Andy Seaborne  wrote:
> >
> > > Today is the 10th anniversary of Apache Jena as a Top Level Project of
> > > the Apache Software Foundation!
> >
> >
> > Congratulations! That’s quite the milestone.
> >
> > Dan
> >


Re: querying lots of quad files in block storage

2022-04-14 Thread Martynas Jusevičius
There was a related thread
https://www.mail-archive.com/users@jena.apache.org/msg18577.html

On Thu, 14 Apr 2022 at 22.42, Justin  wrote:

> Hello,
>
> I am looking to see if Jena is a good fit for querying many billion quads
> (in thousands of .nq files) sitting in block storage (like AWS S3). The .nq
> files don't change. New .nq files do get added to S3, however. Also update
> queries are not needed -- just selects, constructs, asks, etc.
>
> It would be easy to iterate over all the files and produce TDB2s in a
> filesystem (on AWS EBS or EFS)...
>
> Has anyone gone down this path and have some wisdom to share?
> I understand queries won't be as snappy as querying a single TDB2.
>
> Thanks,
> Justin
>


Re: Ontology URI vs document URI

2022-04-03 Thread Martynas Jusevičius
On Fri, Apr 1, 2022 at 1:13 PM Andy Seaborne  wrote:
>
>
>
> On 26/03/2022 15:46, Martynas Jusevičius wrote:
> > Hi,
> >
> > Using the ontology API, if one owl:imports an ontology URI such as
> > <http://www.w3.org/ns/org#> into ontology model, the imported model
> > gets cached under the "http://www.w3.org/ns/org#" key.
> >
> > However, given
> >
> >  <http://www.w3.org/ns/org#> a owl:Ontology
> >
> > one can argue that this URI is of the ontology resource, but what gets
> > loaded and cached is more than that -- it is the ontology *document*.
> > This relates to the old debate whether ontology instances should
> > contain the trailing # in their URIs, i.e. whether ontology and its
> > document is the same resource or distinct resources.
> >
> > The problem is that despite the ontology being cached, code that
> > attempts to dereference its document URI will not be able to make use
> > of it as "http://www.w3.org/ns/org" will not match the
> > "http://www.w3.org/ns/org#" cache key.
>
> This seems inconsistent of imports.
>
> You started with
>  > one owl:imports an ontology URI such as
>  > <http://www.w3.org/ns/org#>
>
> so I don't understand where http://www.w3.org/ns/org comes in.

It comes from the Linked Data browser. The HTTP client dereferences
<http://www.w3.org/ns/org> because that is the document URI.

If the Org ontology is being dereferenced, but it's already been
imported using owl:imports, the browser should be able to take
advantage of the cached model (i.e. document). But it's not directly
connected to the OWL code and is not aware that this document contains
an ontology resource, it can only access the model cache.

> At worst,
> things are cached twice, under two different names. There needs to be
> consistency in naming - same if http://www.w3.org/ns/org# and
> http://example/myCopy/org

There is no consistency in the general case, as I've tried to explain,
because some ontology URIs equal their document URIs (like SPIN) and
some don't (have a trailing #, like the Org ontology).

So far I've worked by adding a second cache key with the document URI:

OntModel ontModel = ModelFactory.createOntologyModel(ontModelSpec, baseModel);
// add as OntModel so that imports do not need to be reloaded during retrieval
ontModel.getDocumentManager().addModel(uri, ontModel, true);
// make sure to cache imported models not only by ontology URI but also by document URI
ontModel.listImportedOntologyURIs(true).forEach((String importURI) ->
{
    try
    {
        URI ontologyURI = URI.create(importURI);
        // remove fragment and normalize
        URI docURI = new URI(ontologyURI.getScheme(), ontologyURI.getSchemeSpecificPart(), null).normalize();
        String mappedURI = ontModelSpec.getDocumentManager().getFileManager().mapURI(docURI.toString());
        // only cache import document URI if it's not already cached or mapped
        if (!ontModelSpec.getDocumentManager().getFileManager().hasCachedModel(docURI.toString())
            && mappedURI.equals(docURI.toString()))
        {
            Model importModel = ontModel.getDocumentManager().getModel(importURI);
            if (importModel == null) throw new IllegalArgumentException("Import model is not cached");

            ontModel.getDocumentManager().addModel(docURI.toString(), importModel, true);
        }
    }
    catch (URISyntaxException ex)
    {
        throw new RuntimeException(ex);
    }
});

>
> Your description says that http://www.w3.org/ns/org# is the import name.
>
> There is the LocationMapper that may help your application.

LocationMapper is unrelated IMO. I could solve this issue if there was
a way to specify or rewrite the import model cache key in the ODM (I
would use a convention where it equals the document URI without
fragment).

>
>  Andy
>
> >
> > My questions are:
> > 1. How can I work around this? Using some kind of post-loadImports()
> > hook to make cache entries for both URIs?
> > 2. Shouldn't the OWL import code be aware of this and cache the
> > document URI while using it as a base for all relative URIs (of
> > ontology terms)?
>
> The base has the same effect. Fragments are stripped on relative URI
> resolution.
>
> >
> > Martynas
> > atomgraph.com


Re: Ontology URI vs document URI

2022-03-26 Thread Martynas Jusevičius
It doesn't look like the ReadHook is applied on the cache key, unfortunately.

The problematic OntDocumentManager code is here:
https://github.com/apache/jena/blob/main/jena-core/src/main/java/org/apache/jena/ontology/OntDocumentManager.java#L983

On Sat, Mar 26, 2022 at 5:09 PM Martynas Jusevičius
 wrote:
>
> Could OntDocumentManager.ReadHook be used for this?
> https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/ontology/OntDocumentManager.ReadHook.html
>
> On Sat, Mar 26, 2022 at 4:46 PM Martynas Jusevičius
>  wrote:
> >
> > Hi,
> >
> > Using the ontology API, if one owl:imports an ontology URI such as
> > <http://www.w3.org/ns/org#> into ontology model, the imported model
> > gets cached under the "http://www.w3.org/ns/org#" key.
> >
> > However, given
> >
> > <http://www.w3.org/ns/org#> a owl:Ontology
> >
> > one can argue that this URI is of the ontology resource, but what gets
> > loaded and cached is more than that -- it is the ontology *document*.
> > This relates to the old debate whether ontology instances should
> > contain the trailing # in their URIs, i.e. whether ontology and its
> > document is the same resource or distinct resources.
> >
> > The problem is that despite the ontology being cached, code that
> > attempts to dereference its document URI will not be able to make use
> > of it as "http://www.w3.org/ns/org" will not match the
> > "http://www.w3.org/ns/org#" cache key.
> >
> > My questions are:
> > 1. How can I work around this? Using some kind of post-loadImports()
> > hook to make cache entries for both URIs?
> > 2. Shouldn't the OWL import code be aware of this and cache the
> > document URI while using it as a base for all relative URIs (of
> > ontology terms)?
> >
> > Martynas
> > atomgraph.com


Re: Ontology URI vs document URI

2022-03-26 Thread Martynas Jusevičius
Could OntDocumentManager.ReadHook be used for this?
https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/ontology/OntDocumentManager.ReadHook.html

On Sat, Mar 26, 2022 at 4:46 PM Martynas Jusevičius
 wrote:
>
> Hi,
>
> Using the ontology API, if one owl:imports an ontology URI such as
> <http://www.w3.org/ns/org#> into ontology model, the imported model
> gets cached under the "http://www.w3.org/ns/org#" key.
>
> However, given
>
> <http://www.w3.org/ns/org#> a owl:Ontology
>
> one can argue that this URI is of the ontology resource, but what gets
> loaded and cached is more than that -- it is the ontology *document*.
> This relates to the old debate whether ontology instances should
> contain the trailing # in their URIs, i.e. whether ontology and its
> document is the same resource or distinct resources.
>
> The problem is that despite the ontology being cached, code that
> attempts to dereference its document URI will not be able to make use
> of it as "http://www.w3.org/ns/org" will not match the
> "http://www.w3.org/ns/org#" cache key.
>
> My questions are:
> 1. How can I work around this? Using some kind of post-loadImports()
> hook to make cache entries for both URIs?
> 2. Shouldn't the OWL import code be aware of this and cache the
> document URI while using it as a base for all relative URIs (of
> ontology terms)?
>
> Martynas
> atomgraph.com


Ontology URI vs document URI

2022-03-26 Thread Martynas Jusevičius
Hi,

Using the ontology API, if one owl:imports an ontology URI such as
<http://www.w3.org/ns/org#> into ontology model, the imported model
gets cached under the "http://www.w3.org/ns/org#" key.

However, given

<http://www.w3.org/ns/org#> a owl:Ontology

one can argue that this URI is of the ontology resource, but what gets
loaded and cached is more than that -- it is the ontology *document*.
This relates to the old debate whether ontology instances should
contain the trailing # in their URIs, i.e. whether ontology and its
document is the same resource or distinct resources.

The problem is that despite the ontology being cached, code that
attempts to dereference its document URI will not be able to make use
of it as "http://www.w3.org/ns/org" will not match the
"http://www.w3.org/ns/org#" cache key.

My questions are:
1. How can I work around this? Using some kind of post-loadImports()
hook to make cache entries for both URIs?
2. Shouldn't the OWL import code be aware of this and cache the
document URI while using it as a base for all relative URIs (of
ontology terms)?

Martynas
atomgraph.com


Re: SPARQL optional limiting results

2022-03-18 Thread Martynas Jusevičius
Can you provide a full query string and a data sample that illustrate
the problem? Then it's easy to see what's going on, for example on
http://sparql.org/sparql.html.

On Fri, Mar 18, 2022 at 11:52 AM Mikael Pesonen
 wrote:
>
>
> Is this a problem with query, not with Jena?
>
> On 15/03/2022 9.30, Lorenz Buehmann wrote:
> > Hi,
> >
> > I'm probably misunderstanding the query, but what is the purpose of
> > the OPTIONAL here?
> >
> > ?graph is bound because of VALUES clause, ?concept is bound because of
> > the graph pattern before the OPTIONAL as well.
> >
> > So ?graph and ?concept are bound on the left hand side of the
> > left-join aka OPTIONAL
> >
> > Here is the algebra:
> >
> > (join
> >   (table (vars ?graph)
> > (row [?graph])
> > (row [?graph])
> >   )
> >   (assign ((?graph ?*g0))
> > (leftjoin
> >   (distinct
> > (project (?concept ?prefLabelm ?altLabelm)
> >   (filter (= (lang ?prefLabelm) "fi")
> > (quadpattern
> >   (quad ?*g0 ??0 rdf:first ?concept)
> >   (quad ?*g0 ??0 rdf:rest ??1)
> >   (quad ?*g0 ??1 rdf:first ?score1)
> >   (quad ?*g0 ??1 rdf:rest ??2)
> >   (quad ?*g0 ??2 rdf:first ?prefLabelm)
> >   (quad ?*g0 ??2 rdf:rest rdf:nil)
> >   (quad ?*g0 ??0 text:query ??3)
> >   (quad ?*g0 ??3 rdf:first skos:prefLabel)
> >   (quad ?*g0 ??3 rdf:rest ??4)
> >   (quad ?*g0 ??4 rdf:first "aamiainen*")
> >   (quad ?*g0 ??4 rdf:rest rdf:nil)
> > 
> >   (sequence
> > (graph ?*g0
> >   (path ?concept (path* skos:broader) ??5))
> > (quadpattern (quad ?*g0 ??5 skos:topConceptOf ?graph)
> >
> >
> > Can you say what you want to achieve with the OPTIONAL maybe, it won't
> > return any additional data as far as I can see.
> >
> > On 14.03.22 14:30, Mikael Pesonen wrote:
> >> Hi, not directly related to Jena, but I have a query in which
> >> optional clause limits the number of results. I thought it's never
> >> possible. So below query returns less results with optional enabled.
> >> Wonder why is that and what would be the correct way to get optional
> >> data so than all rows are returned?
> >>
> >> SELECT *
> >> WHERE
> >> {
> >> VALUES ?graph {
> >> }
> >> GRAPH ?graph
> >> {
> >> {
> >> SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
> >> {
> >> {
> >> (?concept ?score1 ?prefLabelm) text:query
> >> (skos:prefLabel "aamiainen*") .
> >> FILTER ( (lang(?prefLabelm) = "fi" ))
> >> }
> >> }
> >> }
> >># OPTIONAL { ?concept skos:broader* [ skos:topConceptOf ?graph] }
> >> }
> >> }
>
> --
> Lingsoft - 30 years of Leading Language Management
>
> www.lingsoft.fi
>
> Speech Applications - Language Management - Translation - Reader's and 
> Writer's Tools - Text Tools - E-books and M-books
>
> Mikael Pesonen
> System Engineer
>
> e-mail: mikael.peso...@lingsoft.fi
> Tel. +358 2 279 3300
>
> Time zone: GMT+2
>
> Helsinki Office
> Eteläranta 10
> FI-00130 Helsinki
> FINLAND
>
> Turku Office
> Kauppiaskatu 5 A
> FI-20100 Turku
> FINLAND
>


Re: [4.3.2] Cannot invoke "org.apache.jena.rdf.model.Property.asNode()" because "org.apache.jena.vocabulary.RDF.type" is null

2022-03-15 Thread Martynas Jusevičius
Hi,

On Wed, Mar 9, 2022 at 12:34 PM Andy Seaborne  wrote:
>
>
>
> On 09/03/2022 11:16, Martynas Jusevičius wrote:
> > Hi,
> >
> > This appeared after Java upgrade from 11 to 17:
> >
> > WARN LocationMapper:188 - Error in configuration file: Cannot invoke
> > "org.apache.jena.rdf.model.Property.asNode()" because
> > "org.apache.jena.vocabulary.RDF.type" is null
>
> May be init related ... depends when it happened in the app.

Most likely it is, as it happens very early in the startup.
I haven't noticed it with Java 11 though, pretty sure it wasn't there.

>
> Always good to call
> JenaSystem.init
> before any Jena code is touched if you can. It makes the whole thing
> deterministic.
>

JenaSystem.init() is called in a ServletContextListener.

> > I was looking at the LocationMapper code, but line 188 does not
> > contain anything like that:
> > https://github.com/apache/jena/blob/main/jena-core/src/main/java/org/apache/jena/util/LocationMapper.java#L188
>
> Wrong location manager?

We have a custom PrefixMapper, but from the first look line #188
doesn't look related either:
https://github.com/AtomGraph/Web-Client/blob/master/src/main/java/com/atomgraph/client/locator/PrefixMapper.java#L188
Or is it?

>
> Look at any stacktraces.

There is no stack trace.

>
> Run 4.4.0.

Will do soon :)

>
> >
> > What is the cause and does this need to be addressed?
> >
> > Martynas
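
The listener mentioned above can be as small as this sketch (javax.servlet
here -- use jakarta.servlet on newer containers; JenaSystem lives in
org.apache.jena.sys in Jena 4.x):

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import org.apache.jena.sys.JenaSystem;

public class JenaInitListener implements ServletContextListener
{
    @Override
    public void contextInitialized(ServletContextEvent sce)
    {
        JenaSystem.init(); // runs before any other Jena code is touched
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) { }
}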


Re: JSON-LD: 1.0 or 1.1

2022-03-15 Thread Martynas Jusevičius
Hi,

Are the output "flavors" for JSON-LD 1.0 only then?
https://jena.apache.org/documentation/io/rdf-output.html#json-ld

On Fri, Mar 11, 2022 at 11:39 AM Andy Seaborne  wrote:
>
> Jena has both JSON 1.0, provided by jsonld-java, and JSON-LD 1.1,
> provided by Titanium.
>
> What should the default settings be?
>
> For parsing that means what is bound to "application/ld+json" and file
> extension .jsonld.
>
> For writing, it means what is setup for Lang.JSONLD.
>
> This is two decisions - parsing and writing can be different.
>
>
> But.
>
> It is not so simple:
>
> 1/ For Java11, the default settings for java.net.http can't contact
> schema.org.
>
> This is not a problem with Titanium.
>
> See example code that uses only the plain java.net.http package.
>
>https://issues.apache.org/jira/browse/JENA-2306
>
> HTTP/2 works for java 17.0.2, but does not work for java 11.0.14.
>
> And that means { "@context": "https://schema.org/" } does not work.
>
> (http: redirects to that as well)
>
> This _may_ be:
> https://bugs.openjdk.java.net/browse/JDK-8218546
>
> in which case it is fixed in java 11.0.15.
>
> In my testing, it wasn't just google HTTP/2 sites.
>
> ** Please try the example code and report the Java version and what happens.
>
> 2/ Jena is writing JSON-LD 1.1 without much in the way of transformation
> nor creating a @context from the RDF data. It prints full URIs; numbers
> aren't abbreviated etc etc. so it not very pretty.
>
> Or should Jena only write some plain output and expect the application
> to further transform it?
>
> Even for this, it needs some improvement so that there is a @context to
> work from.
>
> ** Can you contribute here? The code is in JsonLD11Writer.java.
>
>  https://issues.apache.org/jira/browse/JENA-2153
>
>  Andy


[4.3.2] Cannot invoke "org.apache.jena.rdf.model.Property.asNode()" because "org.apache.jena.vocabulary.RDF.type" is null

2022-03-09 Thread Martynas Jusevičius
Hi,

This appeared after Java upgrade from 11 to 17:

WARN LocationMapper:188 - Error in configuration file: Cannot invoke
"org.apache.jena.rdf.model.Property.asNode()" because
"org.apache.jena.vocabulary.RDF.type" is null

I was looking at the LocationMapper code, but line 188 does not
contain anything like that:
https://github.com/apache/jena/blob/main/jena-core/src/main/java/org/apache/jena/util/LocationMapper.java#L188

What is the cause and does this need to be addressed?

Martynas


Re: Can Fuseki SHACL-validate data before inserting them into the graph?

2022-03-06 Thread Martynas Jusevičius
Processor can validate request payloads against SPIN and SHACL. The
constraints are defined as part of the LDT ontology.
https://github.com/AtomGraph/Processor

On Sat, 5 Mar 2022 at 19.35, Andy Seaborne  wrote:

>
>
> On 04/03/2022 17:24, Moritz Orth wrote:
> > Hello everyone,
> >
> > I’m currently playing around with SHACL in Jena and just asked myself:
> Can Fuseki validate data against SHACL shapes prior to inserting them into
> the graph, refusing to add them when they don’t conform with the shapes?
>
> No, not currently.
>
>  Andy
>
> >
> > The docs under https://jena.apache.org/documentation/shacl/index.html <
> https://jena.apache.org/documentation/shacl/index.html> show that
> something like this is possible using the Java API directly, using
> GraphValidation.update(). Fuseki allows for creating a SHACL validation
> report after inserting some data, however I cannot see a possibility to
> achieve this kind of transaction rollback behaviour that I do with the API.
> >
> > Is there some kind of operation mode for Fuseki that allows you to
> specify some SHACL shapes on startup, and then validate all triple
> insertion requests against those shapes?
> >
> > Thanks in advance for some guidance on the topic.
> >
> > Best regards
> > Moritz
> >
> >
>
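
For the API-level rollback behaviour Moritz refers to, a sketch roughly
following the GraphValidation.update() pattern from the docs (file names
are placeholders; the exact package of ShaclValidationException may differ
between Jena versions):

import org.apache.jena.graph.Graph;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.shacl.GraphValidation;
import org.apache.jena.shacl.Shapes;
import org.apache.jena.shacl.lib.ShLib;
import org.apache.jena.shacl.validation.ShaclValidationException;

public class GuardedUpdate
{
    public static void main(String[] args)
    {
        Shapes shapes = Shapes.parse(RDFDataMgr.loadGraph("shapes.ttl"));
        Graph data = RDFDataMgr.loadGraph("data.ttl");
        try
        {
            GraphValidation.update(shapes, data, () -> {
                // change 'data' here; the changes are rolled back if the
                // updated graph no longer conforms to 'shapes'
            });
        }
        catch (ShaclValidationException ex)
        {
            ShLib.printReport(ex.getReport()); // update rejected: print the violations
        }
    }
}

As Andy says, Fuseki itself has no such hook; the sketch only shows what
the Java API offers.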


Re: Fuseki context path?

2022-02-14 Thread Martynas Jusevičius
Adam,

Why not use the WAR file then in a servlet container?

On Mon, 14 Feb 2022 at 21.59,  wrote:

> I'm afraid that doesn't work because I'm interested in proxying the entire
> application, not a single dataset. I want to expose the whole UI, admin,
> SPARQL editor and all.
>
> I've tried proxying as you describe using --localhost, but the static
> resources and JavaScript that compose the UI don't come through properly
> when I have a path fragment on the other side a la:
>
> ProxyPass /fuseki http://localhost:3030
>
>  I'd really rather not get into rewriting HTML! I was hoping for a simple:
>
> ProxyPass /fuseki http://localhost:3030/fuseki
>
> style of action.
>
> Does that make sense?
>
> Adam
>
>
> On Mon, Feb 14, 2022, 2:27 PM Andy Seaborne  wrote:
>
> >
> >
> > On 14/02/2022 17:30, aj...@apache.org wrote:
> > > I'm probably missing something obvious, because I haven't looked at
> > Fuseki
> > > in quite some time. I cannot seem to find any way to set the servlet
> > > context path for Fuseki in its standalone (non-WAR) incarnation, which
> I
> > > want to do in order to get it proxied behind httpd.
> >
> > For Fuseki standalone server (in the download) and Fuseki Main:
> >
> > Set the name of the dataset to a path. The name can have a "/" in it but
> > it seems to need the service name to help it distinguish between the
> > "sparql" query service and /some/path/dataset thinking "dataset" is the
> > service (routing has been decided before the named services are
> > available to inspect).
> >
> > fuseki-server /some/path/dataset/sparql
> >
> > Is that enough for you?
> >
> > BTW:
> >
> > One way to proxy is to run it on a known port and then use --localhost -
> > the Fuseki server then will only talk to HTTP traffic on the localhost
> > interface (IPv4 or IPv6), not to directly sent traffic.
> >
> >  Andy
> >
> > > Is there a setting here, or will I have to define a Jetty configuration
> > (in
> > > which case, do we have an example available?)?
> > >
> > > Thanks for any info!
> > >
> > > Adam
> > >
> >
>


Re: Disabling BNode UID generation

2022-02-04 Thread Martynas Jusevičius
Hi Ryan,

Isn't it easier to skolemize the bnodes into URIs that you control?

If you only have URIs, then you could even hash the graph with SPARQL:
https://stackoverflow.com/questions/65798817/how-to-generate-a-hash-of-an-rdf-graph-using-sparql
It works but probably doesn't scale that well.

Martynas

On Fri, Feb 4, 2022 at 8:09 PM Shaw, Ryan  wrote:
>
> Hello,
>
> I am trying to experiment with generating diffable N-Triples or flat Turtle 
> files.
>
> I was hoping that I could do this by setting 
> JenaParameters.disableBNodeUIDGeneration to true, so that blank nodes would 
> be assigned IDs in increasing order as the parser created them. But it seems 
> that only some methods of blank node creation respect this setting. When I 
> parse Turtle with BNode UID generation disabled, I get a mix of `ANNN` 
> (incremented, as expected) and random UUID BNode IDs. When I parse N-Triples 
> I get all random UUIDs.
>
> Is there any way to tap into the parsing pipeline to ensure that all BNode 
> IDs are deterministically (ideally incrementally) generated?
>
> Thanks,
> Ryan
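
One way to tap into the parsing pipeline is the parser's label-to-node
policy -- a sketch, assuming RDFParserBuilder.labelToNode() and
LabelToNode.createIncremental() as in recent Jena versions (the input file
name is a placeholder):

import org.apache.jena.graph.Graph;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.lang.LabelToNode;
import org.apache.jena.sparql.graph.GraphFactory;

public class DeterministicBNodes
{
    public static void main(String[] args)
    {
        Graph graph = GraphFactory.createDefaultGraph();
        RDFParser.create()
                 .source("data.nt")
                 .lang(Lang.NTRIPLES)
                 .labelToNode(LabelToNode.createIncremental()) // _:b0, _:b1, ... in parse order
                 .parse(graph);
    }
}

Labels are allocated in parse order, so the output only diffs cleanly if
the input serialization order is stable (sorting the written N-Triples
lines helps).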


Re: Trying to count the properties used for each class

2022-01-24 Thread Martynas Jusevičius
You're counting the same thing you're grouping by. I think you need:

SELECT ?c (COUNT(DISTINCT ?p) AS ?pcount)
WHERE {
   ?s a ?c .
   ?s ?p ?o .
}
GROUP BY ?c

http://sparql.org/sparql?query=SELECT+%3Fc+%28COUNT%28DISTINCT+%3Fp%29+AS+%3Fpcount%29%0D%0AWHERE+%7B%0D%0A+++%3Fs+a+%3Fc+.%0D%0A+++%3Fs+%3Fp+%3Fo+.%0D%0A%7D%0D%0AGROUP+BY+%3Fc=http%3A%2F%2Fwww.snee.com%2Fbobdc.blog%2Ffiles%2FBeatlesMusicians.ttl=text=%2Fxml-to-html.xsl

On Tue, Jan 25, 2022 at 12:05 AM Bob DuCharme  wrote:
>
> Using arq and the data at
> http://www.snee.com/bobdc.blog/files/BeatlesMusicians.ttl, I’m trying to
> write a query that will list the classes used in the data and the number
> of distinct properties used by instances of that class. I’m having a
> hard time and can’t even write a query that lists the number of
> properties used for just one of the classes; the following just shows me
> a series of ones.
>
> SELECT (COUNT(DISTINCT ?p) AS ?pcount)
> WHERE {
> ?s a  .
> ?s ?p ?o .
> }
> GROUP BY ?p
>
> Any suggestions?
>
> Thanks,
>
> Bob
>


Mapping multiple files into the same namespace

2022-01-24 Thread Martynas Jusevičius
Hi,

I want to merge multiple RDF files into a single ontology under one
namespace URI. E.g. to add custom assertions to ontologies without
touching their original files.

Can LocationMapper/FileManager be made/extended to do this, or do I
need to roll something of my own?

Martynas
atomgraph.com
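
A sketch of one way to do this without touching LocationMapper: read the
files into a single union Model and register it with the OntDocumentManager
under the ontology URI (the file names and the Org ontology URI are just
placeholders):

import org.apache.jena.ontology.OntDocumentManager;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.riot.RDFDataMgr;

public class MergedOntology
{
    public static void main(String[] args)
    {
        Model merged = ModelFactory.createDefaultModel();
        RDFDataMgr.read(merged, "org.ttl");       // the original ontology file
        RDFDataMgr.read(merged, "org-extra.ttl"); // custom assertions on top of it

        // owl:imports of this URI now resolve to the merged Model
        OntDocumentManager.getInstance().addModel("http://www.w3.org/ns/org#", merged);
    }
}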


Re: Dynamically restricting graph access at SPARQL query time

2022-01-24 Thread Martynas Jusevičius
You're more than welcome :)

On Mon, Jan 24, 2022 at 3:41 PM Vilnis Termanis
 wrote:
>
> Hi Martynas,
>
> Thank you very much for the suggestion (and additional information 
> out-of-band).
> I've been having a look at LinkedDataHub and will come back to you
> with some questions, if you don't mind.
>
> Regards,
> Vilnis
>
> On Fri, 21 Jan 2022 at 15:26, Martynas Jusevičius
>  wrote:
> >
> > WebAccessControl ontology might be relevant here:
> > https://www.w3.org/wiki/WebAccessControl
> > We're using a request filter that controls access against
> > authorizations using SPARQL.
> >
> > On Fri, Jan 21, 2022 at 4:13 PM Vilnis Termanis
> >  wrote:
> > >
> > > Hi,
> > >
> > > For a SPARQL query via Fuseki, we are trying to restrict visibility of
> > > groups of triples (each with multiple subjects) dynamically, in order
> > > to allow for generic queries to be executed by users (instead of
> > > providing tinned ones).
> > >
> > > Looking at the available ACL mechanisms in Jena/Fuseki, I assume
> > > storing each of these groups as a distinct graph might be the way
> > > forward. (The expectation is to be able to support 10^5 or higher
> > > number of these.)
> > >
> > > I.e.: Given a user (external to Fuseki, e.g. presented via shiro via
> > > LDAP/other), only consider triples from the set of graphs 1..N during
> > > the query. (Where the allowed list of 1..N graphs is to be looked up
> > > at the point of the query.)
> > >
> > > From my limited understanding, some potential routes are:
> > >
> > > a) jena-fuseki-access - Filters triples at storage level via "TDB Quad
> > > Filter" support in TDB.
> > > However, the configuration of allowed graphs per user is static at 
> > > runtime.
> > >
> > > b) jena-permissions - Extends the SPARQL query engine with an Op
> > > rewriter which allows a user-defined evaluator implementation to
> > > allow/deny access to a graph/triple, given a specific user/principal.
> > > (The specific yes/no evaluation responses are cached for the duration
> > > of a query/operation.)
> > > However, this can only be applied to a single graph as it stands.
> > >
> > > c) Parse & re-write the query to e.g. scope it using a fixed set of
> > > "FROM" clauses. From some minimal testing (with ~200 FROM clauses)
> > > this does not appear to perform well (compared to a tinned query which
> > > explicitly restricts access via knowledge of the ontologies involved).
> > > I appreciate that maybe having a large list of FROM clauses is an
> > > anti-pattern.
> > >
> > > My questions are:
> > >
> > > 1) Does filtering to a subset of graphs (from a large set of
> > > graphs) to restrict access sound like a sensible thing to do? (Note
> > > that each of these graphs would contain a set of multiple subjects -
> > > i.e. we are not trying filter by specific predicate/object values.)
> > >
> > > 2) Would extending either jena-fuseki-access to support the
> > > user-graph-list lookup dynamically OR extend jena-permissions to work
> > > at dataset level be sensible things to do?
> > >
> > > 3) If the answer to either of (2) is yes - I'd be interested in
> > > getting a better understanding of what would be involved to gauge the
> > > size/effort of such an extension. I have had a look at the codebases for the
> > > aforementioned projects, but my knowledge of TDB/ARQ/etc is very
> > > limited. (We'd potentially be interested in taking this on, time &
> > > priorities permitting.)
> > >
> > > I didn't know which mailing list to send this to but I thought the
> > > users list would probably be a better starting point.
> > >
> > > Regards,
> > > Vilnis
> > >
> > > --
> > > Vilnis Termanis
> > > Senior Software Developer
> > >
> > > e | vilnis.terma...@iotics.com
> > > www.iotics.com
>
>
>
> --
> Vilnis Termanis
> Senior Software Developer
>
> m | +44 (0) 7521 012309
> e | vilnis.terma...@iotics.com
> www.iotics.com
>
> The information contained in this email is strictly confidential and
> intended only for the parties noted. If this email was not intended
> for your use, please contact Iotics. For more on our Privacy Policy
> please visit https://www.iotics.com/legal/


Re: Replacing FileManager with Dataset

2022-01-22 Thread Martynas Jusevičius
I meant  map in FileManager...

On Sat, Jan 22, 2022 at 9:14 PM Martynas Jusevičius
 wrote:
>
> Hi,
>
> We are using FileManager as part of the Ontology API. We noticed its
> Model cache-related methods are marked deprecated:
> https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/util/FileManager.html
>
> I don't see how OntDocumentManager can work without some sort of model
> cache. But I think that FileManager (or at least the caching part)
> could be replaced with a Dataset implementation. The cache
> map in FileManager is essentially the same as the named graphs in the
> Dataset.
>
> One immediate advantage would be that all OntDocumentManager
> ontologies would be accessible using SPARQL. We have a use case to
> make them queryable, which led to this idea.
>
> I've made a PoC implementation. It uses DataManager as a subclass of
> FileManager with getModelCache() exposed as an immutable map, because
> FileManager itself provides no way to list the entries in the cache
> map.
> https://github.com/AtomGraph/LinkedDataHub/blob/develop/src/main/java/com/atomgraph/linkeddatahub/server/util/DataManagerDataset.java
>
> Thoughts?
>
> Martynas


Replacing FileManager with Dataset

2022-01-22 Thread Martynas Jusevičius
Hi,

We are using FileManager as part of the Ontology API. We noticed its
Model cache-related methods are marked deprecated:
https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/util/FileManager.html

I don't see how OntDocumentManager can work without some sort of model
cache. But I think that FileManager (or at least the caching part)
could be replaced with a Dataset implementation. The cache
map in FileManager is essentially the same as the named graphs in the
Dataset.

One immediate advantage would be that all OntDocumentManager
ontologies would be accessible using SPARQL. We have a use case to
make them queryable, which led to this idea.

I've made a PoC implementation. It uses DataManager as a subclass of
FileManager with getModelCache() exposed as an immutable map, because
FileManager itself provides no way to list the entries in the cache
map.
https://github.com/AtomGraph/LinkedDataHub/blob/develop/src/main/java/com/atomgraph/linkeddatahub/server/util/DataManagerDataset.java

Thoughts?

Martynas
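
The Dataset wrapping itself can then be a few lines. This sketch assumes
the PoC's getModelCache() accessor described above -- FileManager itself
does not expose the map:

import java.util.Map;
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.rdf.model.Model;

public class CacheAsDataset
{
    public static Dataset toDataset(Map<String, Model> modelCache)
    {
        Dataset dataset = DatasetFactory.create();
        modelCache.forEach(dataset::addNamedModel); // each cached document becomes a named graph
        return dataset;
    }
}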


Re: Dynamically restricting graph access at SPARQL query time

2022-01-21 Thread Martynas Jusevičius
WebAccessControl ontology might be relevant here:
https://www.w3.org/wiki/WebAccessControl
We're using a request filter that controls access against
authorizations using SPARQL.

On Fri, Jan 21, 2022 at 4:13 PM Vilnis Termanis
 wrote:
>
> Hi,
>
> For a SPARQL query via Fuseki, we are trying to restrict visibility of
> groups of triples (each with multiple subjects) dynamically, in order
> to allow for generic queries to be executed by users (instead of
> providing tinned ones).
>
> Looking at the available ACL mechanisms in Jena/Fuseki, I assume
> storing each of these groups as a distinct graph might be the way
> forward. (The expectation is to be able to support 10^5 or higher
> number of these.)
>
> I.e.: Given a user (external to Fuseki, e.g. presented via shiro via
> LDAP/other), only consider triples from the set of graphs 1..N during
> the query. (Where the allowed list of 1..N graphs is to be looked up
> at the point of the query.)
>
> From my limited understanding, some potential routes are:
>
> a) jena-fuseki-access - Filters triples at storage level via "TDB Quad
> Filter" support in TDB.
> However, the configuration of allowed graphs per user is static at runtime.
>
> b) jena-permissions - Extends the SPARQL query engine with an Op
> rewriter which allows a user-defined evaluator implementation to
> allow/deny access to a graph/triple, given a specific user/principal.
> (The specific yes/no evaluation responses are cached for the duration
> of a query/operation.)
> However, this can only be applied to a single graph as it stands.
>
> c) Parse & re-write the query to e.g. scope it using a fixed set of
> "FROM" clauses. From some minimal testing (with ~200 FROM clauses)
> this does not appear to perform well (compared to a tinned query which
> explicitly restricts access via knowledge of the ontologies involved).
> I appreciate that maybe having a large list of FROM clauses is an
> anti-pattern.
>
> My questions are:
>
> 1) Does filtering to a subset of graphs (from a large set of
> graphs) to restrict access sound like a sensible thing to do? (Note
> that each of these graphs would contain a set of multiple subjects -
> i.e. we are not trying filter by specific predicate/object values.)
>
> 2) Would extending either jena-fuseki-access to support the
> user-graph-list lookup dynamically OR extend jena-permissions to work
> at dataset level be sensible things to do?
>
> 3) If the answer to either of (2) is yes - I'd be interested in
> getting a better understanding of what would be involved to gauge the
> size/effort of such an extension. I have had a look at the codebases for the
> aforementioned projects, but my knowledge of TDB/ARQ/etc is very
> limited. (We'd potentially be interested in taking this on, time &
> priorities permitting.)
>
> I didn't know which mailing list to send this to but I thought the
> users list would probably be a better starting point.
>
> Regards,
> Vilnis
>
> --
> Vilnis Termanis
> Senior Software Developer
>
> e | vilnis.terma...@iotics.com
> www.iotics.com


Re: Trasforming and quering RDFs

2022-01-18 Thread Martynas Jusevičius
As for the query, you need to stop thinking in nested objects and
start thinking in triples.

Could the query be something like this?

SELECT  ?s ?datatype
WHERE
  { ?s  <https://schema.org/Type>  1 ;
<https://schema.org/Datatype>  ?datatype
  }

You can try it on your data on sparql.org: https://bit.ly/3nBQKnr

On Tue, Jan 18, 2022 at 10:03 PM Rinor Sefa  wrote:
>
> Hey,
>
> Regarding your problem "I can't find X in the RDF file". In your example, the 
> RDF file is in XML format, which may not be ideal for user reading. Why don't 
> you try to present the RDF in more readable formats such as N3 or Turtle?
>
> Rinor Sefa
>
> -----Original Message-----
> From: Martynas Jusevičius 
> Sent: Tuesday, 18 January 2022 17:38
> To: jena-users-ml 
> Subject: Re: Trasforming and quering RDFs
>
> Hi,
>
> SPARQL is an RDF query language, so no.
>
> But there are tools that can help:
> https://github.com/AtomGraph/JSON2RDF
> https://sparql-anything.cc
>
>
> Martynas
> atomgraph.com
>
> On Tue, Jan 18, 2022 at 5:32 PM emri mbiemri  
> wrote:
> >
> > Dears,
> >
> > In order to have a more scalable knowledge base and easy to query, I
> > have converted JSON files [1] to RDF graphs [2]. I am trying to get all
> > "Datatype" from "ViewModel" which has a "Type":1. The issue is I
> > cannot find the Type: 1 within the RDF file and if so, I don't know
> > how to query the "Datatype" which has "Type":1 within the "ViewModel" tag?
> >
> > And secondly, is there a direct way to query the JSON files easier
> > using SPARQL without having to convert them into RDF?
> >
> > Hope for your help.
> >
> >
> > [1]
> > https://github.com/iliriani/iliriangit/blob/master/CustomerForm.json
> > [2]
> > https://github.com/iliriani/iliriangit/blob/master/CustomerForm.rdf


Re: Trasforming and quering RDFs

2022-01-18 Thread Martynas Jusevičius
Hi,

SPARQL is an RDF query language, so no.

But there are tools that can help:
https://github.com/AtomGraph/JSON2RDF
https://sparql-anything.cc


Martynas
atomgraph.com

On Tue, Jan 18, 2022 at 5:32 PM emri mbiemri  wrote:
>
> Dears,
>
> In order to have a more scalable knowledge base and easy to query, I have
> converted JSON files [1] to RDF graphs [2]. I am trying to get
> all "Datatype" from "ViewModel" which has a "Type":1. The issue is I cannot
> find the Type: 1 within the RDF file and if so, I don't know how to query
> the "Datatype" which has "Type":1 within the "ViewModel" tag?
>
> And secondly, is there a direct way to query the JSON files easier using
> SPARQL without having to convert them into RDF?
>
> Hope for your help.
>
>
> [1] https://github.com/iliriani/iliriangit/blob/master/CustomerForm.json
> [2] https://github.com/iliriani/iliriangit/blob/master/CustomerForm.rdf


Re: Using Fuseki to host IRIs / Using Fuseki as an LDP

2022-01-04 Thread Martynas Jusevičius
Hi Jakub,

What you are describing looks like Linked Data backed by an RDF triplestore.

Linked Data Templates (LDT) is a specification for this exact use case, it
defines how Linked Data requests translate to SPARQL commands.
https://atomgraph.github.io/Linked-Data-Templates/

Processor is an implemention of the LDT specification.
https://github.com/AtomGraph/Processor


Martynas
atomgraph.com

On Tue, 4 Jan 2022 at 23.16, Jakub Jałowiec 
wrote:

> Hi,
> Let's say I am hosting Apache Jena Fuseki at http://somewebsite.com and
> that
> I have a persistent dataset at http://
> somewebsite.com/some_persistent_dataset . The persistent dataset contains
> a bunch of RDF triples like this one: "http://
> somewebsite.com/some_persistent_dataset/person_1
> http://xmlns.com/foaf/0.1/age 123".
>
> I'd like to host the "http://
> somewebsite.com/some_persistent_dataset/person_1"
> IRI in Fuseki. Basically what I want to achieve is to provide the user with
> a friendly HTML interface that let's them browse through the links that are
> within the root URL of the Fuseki host. Ideally, I'd like to display the
> associated properties of the IRI grouped by property type, so e.g. all
> "http://somewebsite.com/some_persistent_dataset/person_1 foaf:knows ?X"
> triples are displayed as a single list of Xs for that IRI etc.
>
> I know that I am missing here tons of technical details (e.g. how to query,
> filter & display triples associated with the given IRI) but nonetheless has
> anyone tried to do implement a richer UI in Fuseki that would support
> hosting custom IRIs in that way? That seems to be a basic feature of
> Linked-Data Platforms (LDP) but I have not yet found an LDP that is easy to
> use (please let me know if you have). I thought that implementing something
> like that might be quicker in Fuseki (plus you get the benefit of having an
> OWL reasoner for free).
>
> Does it even make sense to implement such features in Fuseki or are there
> external tools that are better in it and integrate with Jena?
>
> Best regards,
> Jakub
>


Javadoc links are broken?

2021-12-25 Thread Martynas Jusevičius
Hi,

Happy holidays!

Is it just me or most Javadoc links are broken? Or is it a Google
indexing problem?
For example:
https://jena.apache.org/documentation/javadoc/rdfconnection/org/apache/jena/rdfconnection/RDFDatasetConnection.html
https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/query/DatasetAccessor.html

Those are the links I get searching Google for "RDFDatasetConnection"
and "DatasetAccessor" and both return "Not Found".

Martynas


Re: Read-only wrapper for Jena Model

2021-12-20 Thread Martynas Jusevičius
Thanks Andy.

I want to resolve imports and run inferences and then wrap it to make it
immutable so it can be passed around but not modified. The getOntology()
method is being used so OntModel is preferred to plain Model.

On Mon, 20 Dec 2021 at 13.55, Andy Seaborne  wrote:

>
>
> On 18/12/2021 22:00, Martynas Jusevičius wrote:
> > Andy,
> >
> > A follow-up question: how would you create an immutable OntModel?
> >
> >  OntModel ontModel = ModelFactory.createOntologyModel(ontModelSpec,
> modelRO);
> >
> > Would the ontModel still be mutable?
>
> Don't know.
>
> And what do you want with regards to imports?
> Won't it depend on the OntModelSpec and any inference?
>
> You could always have a read-only wrapper implementation of interface
> OntModel.
>
>  Andy
>
> >
> > On Sat, Aug 28, 2021 at 3:43 PM Andy Seaborne  wrote:
> >>
> >>
> >>
> >> On 27/08/2021 12:23, Zak Mc Kracken wrote:
> >>> Hi all,
> >>>
> >>> I have a little RDF file (describing a dataset metadata), which I want
> >>> to read in an helper class and return as a read-only view on the file.
> >>> The reason to return it as read-only is that I also keep a simple cache
> >>> of uri/Object, which is a simplified view of RDF resources in the file,
> >>> so a modifiable Model would make it impossible to keep the two aligned.
> >>>
> >>> That said, I wonder if there is some read-only wrapper for the Jena's
> >>> Model interface, something similar to Collections.unmodifiableXXX(),
> >>> which of course, would be based on the decorator pattern, with
> >>> delegation to a base Model for most of the interface methods, except
> >>> interceptors for addXXX(), which would throw
> >>> UnsupportedOperationException. Would be easy to implement it, but I
> >>> don't like to reinvent wheels, if something like that already exists.
> >>
> >> Apparently there isn't one. Not sure why not.
> >>
> >> There is a read-only graph (and a read-only DatasetGraph) so one way to
> >> create a read-only model is:
> >>
> >>   Model model = ModelFactory.createDefaultModel();
> >>   Graph graphRO = new GraphReadOnly(model.getGraph());
> >>   Model modelRO = ModelFactory.createModelForGraph(graphRO);
> >>
> >> Graph is a narrower interface so catching things here is less code.  In
> >> fact, GraphBase is read-only unless add/delete(Triple) are overwritten.
> >>
> >>   Andy
> >>>
> >>> Thanks,
> >>> Marco.
> >>>
> >>>
>
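
A sketch of that snapshot-then-wrap approach (assumes
org.apache.jena.sparql.graph.GraphReadOnly; the read-only view is built
with a plain OWL_MEM spec, because a reasoner-backed spec would try to
write into the wrapped graph):

import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.sparql.graph.GraphReadOnly;

public class ReadOnlyOntModels
{
    public static OntModel snapshot(Model base, OntModelSpec spec)
    {
        // resolve imports and run inference with the original spec
        OntModel ontModel = ModelFactory.createOntologyModel(spec, base);
        // materialize the union of base data, imports and inferences
        Model materialized = ModelFactory.createDefaultModel().add(ontModel);
        // wrap the snapshot read-only; add/remove on the result will throw
        return ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM,
                ModelFactory.createModelForGraph(new GraphReadOnly(materialized.getGraph())));
    }
}

getOntology() still works on the wrapper, but the snapshot is detached from
the original model, so later changes to it are not reflected.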


Re: Convert JSON to RDF with Jena

2021-12-20 Thread Martynas Jusevičius
https://github.com/AtomGraph/JSON2RDF

On Mon, 20 Dec 2021 at 12.29, emri mbiemri 
wrote:

> Hello all,
>
> I would like to ask if there is any method to directly convert the below
> JSON file to an RDF graph? If so then how can I do it?
>
>
> https://github.com/iliriani/iliriangit/blob/master/CustomerForm.form%20(1).xml
>


Re: Read-only wrapper for Jena Model

2021-12-18 Thread Martynas Jusevičius
Andy,

A follow-up question: how would you create an immutable OntModel?

OntModel ontModel = ModelFactory.createOntologyModel(ontModelSpec, modelRO);

Would the ontModel still be mutable?

On Sat, Aug 28, 2021 at 3:43 PM Andy Seaborne  wrote:
>
>
>
> On 27/08/2021 12:23, Zak Mc Kracken wrote:
> > Hi all,
> >
> > I have a little RDF file (describing a dataset metadata), which I want
> > to read in an helper class and return as a read-only view on the file.
> > The reason to return it as read-only is that I also keep a simple cache
> > of uri/Object, which is a simplified view of RDF resources in the file,
> > so a modifiable Model would make it impossible to keep the two aligned.
> >
> > That said, I wonder if there is some read-only wrapper for the Jena's
> > Model interface, something similar to Collections.unmodifiableXXX(),
> > which of course, would be based on the decorator pattern, with
> > delegation to a base Model for most of the interface methods, except
> > interceptors for addXXX(), which would throw
> > UnsupportedOperationException. Would be easy to implement it, but I
> > don't like to reinvent wheels, if something like that already exists.
>
> Apparently there isn't one. Not sure why not.
>
> There is a read-only graph (and a read-only DatasetGraph) so one way to
> create a read-only model is:
>
>  Model model = ModelFactory.createDefaultModel();
>  Graph graphRO = new GraphReadOnly(model.getGraph());
>  Model modelRO = ModelFactory.createModelForGraph(graphRO);
>
> Graph is a narrower interface so catching things here is less code.  In
> fact, GraphBase is read-only unless add/delete(Triple) are overridden.
>
>  Andy
> >
> > Thanks,
> > Marco.
> >
> >


Re: [3.16.0] Repeating identical queries from SERVICE

2021-12-10 Thread Martynas Jusevičius
Moving SERVICE down in the joins seems to have helped quite a bit:

PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX  acl:  <http://www.w3.org/ns/auth/acl#>
PREFIX  lacl: <https://w3id.org/atomgraph/linkeddatahub/admin/acl/domain#>
PREFIX  foaf: <http://xmlns.com/foaf/0.1/>
PREFIX  sioc: <http://rdfs.org/sioc/ns#>

DESCRIBE ?auth
FROM 
WHERE
  {   { ?auth  acl:mode  ?Mode
  { ?auth  acl:accessTo  ?this }
UNION
  {   { ?auth  acl:accessToClass  ?Type }
UNION
  { ?auth  acl:accessToClass  ?Class .
?Type (rdfs:subClassOf)* ?Class
  }
SERVICE ?endpoint
  { { GRAPH ?g
{ ?this  a  ?Type }
}
  }
  }
{   { ?auth  acl:agent  ?agent }
  UNION
{ ?auth   acl:agentGroup  ?Group .
  ?Group  foaf:member ?agent
}
}
  }
UNION
  { ?auth  acl:mode  ?Mode
  { ?auth  acl:agentClass  foaf:Agent }
UNION
  { ?auth  acl:agentClass  ?AuthenticatedAgentClass }
  { ?auth  acl:accessTo  ?this }
UNION
  {   { ?auth  acl:accessToClass  ?Type }
UNION
  { ?auth  acl:accessToClass  ?Class .
?Type (rdfs:subClassOf)* ?Class
  }
SERVICE ?endpoint
  { { GRAPH ?g
{ ?this  a  ?Type }
}
  }
  }
  }
  }

On Sat, Dec 11, 2021 at 12:39 AM Martynas Jusevičius
 wrote:
>
> Hi,
>
> I have a query that federates between 2 Fuseki instances (the "remote"
> one is fuseki-end-user):
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  acl:  <http://www.w3.org/ns/auth/acl#>
> PREFIX  lacl: <https://w3id.org/atomgraph/linkeddatahub/admin/acl/domain#>
> PREFIX  foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX  sioc: <http://rdfs.org/sioc/ns#>
>
> DESCRIBE ?auth
> FROM 
> WHERE
>   {   { ?auth  acl:mode  acl:Read
>   { ?auth  acl:accessTo
> <https://kg.opendatahub.bz.it/queries/select-ski-resorts/> }
> UNION
>   { SERVICE <http://fuseki-end-user:3030/ds/>
>   { { GRAPH ?g
> { 
> <https://kg.opendatahub.bz.it/queries/select-ski-resorts/>
> a  ?Type
> }
> }
>   }
>   { ?auth  acl:accessToClass  ?Type }
> UNION
>   { ?auth  acl:accessToClass  ?Class .
> ?Type (rdfs:subClassOf)* ?Class
>   }
>   }
> {   { ?auth  acl:agent  rdfs:Resource }
>   UNION
> { ?auth   acl:agentGroup  ?Group .
>   ?Group  foaf:member rdfs:Resource
> }
> }
>   }
> UNION
>   { ?auth  acl:mode  acl:Read
>   { ?auth  acl:agentClass  foaf:Agent }
> UNION
>   { ?auth  acl:agentClass  rdfs:Resource }
>   { ?auth  acl:accessTo
> <https://kg.opendatahub.bz.it/queries/select-ski-resorts/> }
> UNION
>   { SERVICE <http://fuseki-end-user:3030/ds/>
>   { { GRAPH ?g
> { 
> <https://kg.opendatahub.bz.it/queries/select-ski-resorts/>
> a  ?Type
> }
> }
>   }
>   { ?auth  acl:accessToClass  ?Type }
> UNION
>   { ?auth  acl:accessToClass  ?Class .
> ?Type (rdfs:subClassOf)* ?Class
>   }
>   }
>   }
>   }
>
> What I see in the fuseki-end-user log following this query is a bunch
> (200+ in this case) of identical requests with this query:
>
> SELECT  *
> WHERE
>   { GRAPH ?g
>   { <https://kg.opendatahub.bz.it/queries/select-ski-resorts/>
>   a  ?Type
>   }
>   }
>
> I understand this is due to federation and know that Fuseki does not
> cache the results, but this strikes me as terribly inefficient.
> Each SERVICE request to fuseki-end-user takes around 10 ms but 200+ of
> them add to over 2 seconds.
>
> Is there an opportunity for optimization here? Either of the query or of Jena 
> :)
>
> Martynas
> atomgraph.com


[3.16.0] Repeating identical queries from SERVICE

2021-12-10 Thread Martynas Jusevičius
Hi,

I have a query that federates between 2 Fuseki instances (the "remote"
one is fuseki-end-user):

PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX  acl:  <http://www.w3.org/ns/auth/acl#>
PREFIX  lacl: <https://w3id.org/atomgraph/linkeddatahub/admin/acl/domain#>
PREFIX  foaf: <http://xmlns.com/foaf/0.1/>
PREFIX  sioc: <http://rdfs.org/sioc/ns#>

DESCRIBE ?auth
FROM 
WHERE
  {   { ?auth  acl:mode  acl:Read
          { ?auth  acl:accessTo
                      <https://kg.opendatahub.bz.it/queries/select-ski-resorts/> }
        UNION
          { SERVICE <http://fuseki-end-user:3030/ds/>
              { { GRAPH ?g
                    { <https://kg.opendatahub.bz.it/queries/select-ski-resorts/>
                                a  ?Type
                    }
                }
              }
            { ?auth  acl:accessToClass  ?Type }
            UNION
              { ?auth  acl:accessToClass  ?Class .
                ?Type (rdfs:subClassOf)* ?Class
              }
          }
        {   { ?auth  acl:agent  rdfs:Resource }
          UNION
            { ?auth   acl:agentGroup  ?Group .
              ?Group  foaf:member rdfs:Resource
            }
        }
      }
    UNION
      { ?auth  acl:mode  acl:Read
          { ?auth  acl:agentClass  foaf:Agent }
        UNION
          { ?auth  acl:agentClass  rdfs:Resource }
          { ?auth  acl:accessTo
                      <https://kg.opendatahub.bz.it/queries/select-ski-resorts/> }
        UNION
          { SERVICE <http://fuseki-end-user:3030/ds/>
              { { GRAPH ?g
                    { <https://kg.opendatahub.bz.it/queries/select-ski-resorts/>
                                a  ?Type
                    }
                }
              }
            { ?auth  acl:accessToClass  ?Type }
            UNION
              { ?auth  acl:accessToClass  ?Class .
                ?Type (rdfs:subClassOf)* ?Class
              }
          }
      }
  }

What I see in the fuseki-end-user log following this query is a bunch
(200+ in this case) of identical requests with this query:

SELECT  *
WHERE
  { GRAPH ?g
      { <https://kg.opendatahub.bz.it/queries/select-ski-resorts/>
                  a  ?Type
      }
  }

I understand this is due to federation and know that Fuseki does not
cache the results, but this strikes me as terribly inefficient.
Each SERVICE request to fuseki-end-user takes around 10 ms but 200+ of
them add to over 2 seconds.

Is there an opportunity for optimization here? Either of the query or of Jena :)

Martynas
atomgraph.com


Re: Apache Jena rules to find the minimum in a list of data property values

2021-12-05 Thread Martynas Jusevičius
You could use the CONSTRUCT query form as rules and augment your model
with the constructed triples. Something like this (untested):

PREFIX  covidepid: <>
PREFIX  foaf: <http://xmlns.com/foaf/0.1/>

CONSTRUCT
  {
?person a covidepid:YoungestPerson .
  }
WHERE
  { SELECT  ?house ?person ?lowestAge
WHERE
  { ?person  foaf:age   ?lowestAge ;
 covidepid:livesIn  ?house
{ SELECT  ?house (MIN(?age) AS ?lowestAge)
  WHERE
{ ?person  foaf:age   ?age ;
   covidepid:livesIn  ?house
}
  GROUP BY ?house
}
  }
  }

Fix the covidepid: namespaces before use.
Execute using QueryExecution::execConstruct:
https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/query/QueryExecution.html#execConstruct(org.apache.jena.rdf.model.Model)
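Roughly like this (a sketch; dataModel and constructQueryString are
placeholder names for your model and the query above):

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.rdf.model.Model;

// Run the CONSTRUCT "rule" and add the inferred triples back into the
// data model; rerun if one rule's output should feed another rule.
try (QueryExecution qexec = QueryExecutionFactory.create(constructQueryString, dataModel)) {
    Model inferred = qexec.execConstruct();
    dataModel.add(inferred);
}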

On Sun, Dec 5, 2021 at 2:24 PM Jakub Jałowiec
 wrote:
>
> Thanks, that solves the problem and I'll stick to it for now.
> Nonetheless, is it possible to automatically infer being an instance of the
> hypothetical class "YoungestPerson" ("the person with the lowest foaf:age
> aggregated by house") in Apache Jena as described above? Ideally, I would
> prefer to separate my conceptual/declarative model from raw data
> manipulation using SPARQL. I am new to RDF & ontologies and I am not sure
> to what extent keeping those two separate is possible and whether it is
> worth actually investing a lot of time into that.
>
> Best regards,
> Jakub
>
> On Sun, 5 Dec 2021 at 10:12, Lorenz Buehmann <
> buehm...@informatik.uni-leipzig.de> wrote:
>
> > Hi,
> >
> >
> > the common pattern in SPARQL is to get the aggregated value in an inner
> > query first, then in the outer query get the entity with the aggregated
> > value:
> >
> > SELECT ?house ?person ?lowestAge {
> >?person foaf:age ?lowestAge .
> >?person covidepid:livesIn ?house .
> >
> >
> > {SELECT ?house (min(?age) as ?lowestAge)
> > WHERE {
> >?person foaf:age ?age .
> >?person covidepid:livesIn ?house .
> > }
> > GROUP BY ?house}
> > }
> >
> >
> >
> > On 03.12.21 02:16, Jakub Jałowiec wrote:
> > > Hi,
> > > I would appreciate any help with the following problem. I have a bunch of
> > > (foaf:Persons, myOntology:livesIn, myOntology:Place) triples. I am trying
> > > to find the youngest person in each myOntology:Place (i.e. the person with
> > > the lowest value of foaf:age for each myOntology:Place).
> > > What I've tried so far:
> > > - OWL complex classes (Class Expression Syntax, protegeproject.github.io) -
> > > per my understanding they have too weak expressivity to express aggregates
> > > among other individuals associated with them
> > > - SPARQL query - something along those lines would work fine but I do not
> > > know how to retrieve the IRI of the youngest person:
> > >
> > >> SELECT ?house (min(?age) as ?lowestAge)
> > >> WHERE {
> > >>?person foaf:age ?age .
> > >>?person covidepid:livesIn ?house .
> > >> }
> > >> GROUP BY ?house
> > >
> > > I am curious if extraction of the lowest foaf:age value among a group of
> > > people could be achieved using Apache Jena rules. From the documentation
> > > (https://jena.apache.org/documentation/inference/#rules) it seems to me
> > > that the closest it gets to it is to write my custom built-in function
> > > that would do exactly that. Is that correct?
> > >
> > > Best regards,
> > > Jakub
> > >
> >


Re: [3.16.0] Implementing ReaderRIOT

2021-10-27 Thread Martynas Jusevičius
On Tue, Oct 26, 2021 at 11:52 AM Andy Seaborne  wrote:
>
>
> On 26/10/2021 09:40, Martynas Jusevičius wrote:
> > Hi,
> >
> > I'm implementing a Reader that extracts 

[3.16.0] Implementing ReaderRIOT

2021-10-26 Thread Martynas Jusevičius
Hi,

I'm implementing a Reader that extracts 

Re: [3.17.0] java.lang.NoSuchFieldError: SHACLC

2021-10-25 Thread Martynas Jusevičius
JenaSystem.init() called from ServletContextListener in webapp.

There might be a possibility that I messed up the versions somehow, but I
just cannot see where...

I also happened to hit
https://issues.apache.org/jira/browse/JENA-2018, so I guess I need to
get it together and upgrade to 4.2.0.
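
For anyone hitting this later: the debug toggle Andy suggests below
looks roughly like this in our listener (a sketch; the listener class
name is made up):

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import org.apache.jena.sys.JenaSystem;

public class AppInitListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent event) {
        JenaSystem.DEBUG_INIT = true; // print the init order
        JenaSystem.init();            // before anything else touches Jena
        // ... rest of the webapp setup
    }

    @Override
    public void contextDestroyed(ServletContextEvent event) { }
}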

On Mon, Oct 25, 2021 at 3:57 PM Andy Seaborne  wrote:
>
> Possibly because the app touched SHACLC directly or indirectly before
> JenaSystem.init.
>
> Where is JenaSystem.init triggered from?
>
> try JenaSystem.DEBUG_INIT = true;
>
> before any Jena code.
>
> On 25/10/2021 12:26, Martynas Jusevičius wrote:
> > Hi,
> >
> > Any suggestions as to why I'm getting this error after upgrading from
> > 3.16.0 to 3.17.0?
>
> There were later changes that might be related - try 4.2.0
>
>  Andy
>
> >
> >  java.lang.NoSuchFieldError: SHACLC
> >  at 
> > org.apache.jena.shacl.compact.SHACLC.init(SHACLC.java:43)
> >  at 
> > org.apache.jena.shacl.sys.InitShacl.start(InitShacl.java:30)
> >  at
> > org.apache.jena.sys.JenaSystem.lambda$init$2(JenaSystem.java:117)
> >  at 
> > java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> >  at 
> > org.apache.jena.sys.JenaSystem.forEach(JenaSystem.java:192)
> >  at 
> > org.apache.jena.sys.JenaSystem.forEach(JenaSystem.java:169)
> >  at org.apache.jena.sys.JenaSystem.init(JenaSystem.java:115)
> >
> > I know this usually has to do with clashing JAR versions, but as far
> > as I can see all the Jena dependencies are 3.17.0:
> >
> > [INFO] com.atomgraph:processor:jar:3.10.20-SNAPSHOT
> > [INFO] +- junit:junit:jar:4.13.1:test
> > [INFO] |  \- org.hamcrest:hamcrest-core:jar:1.3:test
> > [INFO] +- jakarta.servlet:jakarta.servlet-api:jar:4.0.3:provided
> > [INFO] +- com.atomgraph:twirl:jar:1.0.22-SNAPSHOT:compile
> > [INFO] |  +- org.apache.jena:jena-arq:jar:3.17.0:compile
> > [INFO] |  |  +- org.apache.jena:jena-core:jar:3.17.0:compile
> > [INFO] |  |  |  +- org.apache.jena:jena-base:jar:3.17.0:compile
> > [INFO] |  |  |  |  +- org.apache.jena:jena-shaded-guava:jar:3.17.0:compile
> > [INFO] |  |  |  |  +- org.apache.commons:commons-csv:jar:1.8:compile
> > [INFO] |  |  |  |  +- commons-codec:commons-codec:jar:1.15:compile
> > [INFO] |  |  |  |  +- org.apache.commons:commons-compress:jar:1.20:compile
> > [INFO] |  |  |  |  \- com.github.andrewoma.dexx:collection:jar:0.7:compile
> > [INFO] |  |  |  +- org.apache.jena:jena-iri:jar:3.17.0:compile
> > [INFO] |  |  |  \- commons-cli:commons-cli:jar:1.4:compile
> > [INFO] |  |  +- org.apache.httpcomponents:httpclient:jar:4.5.13:compile
> > [INFO] |  |  |  \- org.apache.httpcomponents:httpcore:jar:4.4.13:compile
> > [INFO] |  |  +- com.github.jsonld-java:jsonld-java:jar:0.13.2:compile
> > [INFO] |  |  |  \- commons-io:commons-io:jar:2.8.0:compile
> > [INFO] |  |  +- com.fasterxml.jackson.core:jackson-core:jar:2.11.3:compile
> > [INFO] |  |  +- 
> > com.fasterxml.jackson.core:jackson-databind:jar:2.11.3:compile
> > [INFO] |  |  |  \-
> > com.fasterxml.jackson.core:jackson-annotations:jar:2.11.3:compile
> > [INFO] |  |  +- 
> > org.apache.httpcomponents:httpclient-cache:jar:4.5.13:compile
> > [INFO] |  |  +- org.apache.thrift:libthrift:jar:0.13.0:compile
> > [INFO] |  |  |  \- javax.annotation:javax.annotation-api:jar:1.3.2:compile
> > [INFO] |  |  \- org.apache.commons:commons-lang3:jar:3.11:compile
> > [INFO] |  \- org.slf4j:slf4j-log4j12:jar:1.7.25:compile
> > [INFO] | +- org.slf4j:slf4j-api:jar:1.7.25:compile
> > [INFO] | \- log4j:log4j:jar:1.2.17:compile
> > [INFO] +- com.atomgraph:core:jar:3.0.18-SNAPSHOT:compile
> > [INFO] |  +- 
> > org.glassfish.jersey.containers:jersey-container-servlet:jar:2.30.1:compile
> > [INFO] |  |  +-
> > org.glassfish.jersey.containers:jersey-container-servlet-core:jar:2.30.1:compile
> > [INFO] |  |  +- org.glassfish.jersey.core:jersey-common:jar:2.30.1:compile
> > [INFO] |  |  |  +- 
> > jakarta.annotation:jakarta.annotation-api:jar:1.3.5:compile
> > [INFO] |  |  |  +- org.glassfish.hk2:osgi-resource-locator:jar:1.0.3:compile
> > [INFO] |  |  |  \- com.sun.activation:jakarta.activation:jar:1.2.1:compile
> > [INFO] |  |  +- org.glassfish.jersey.core:jersey-server:jar:2.30.1:compile
> > [INFO] |  |  |  +-
> > org.glassfish.jersey.media:jersey-media-jaxb:jar:2.30.1:compile
> > [INFO] |  |  |  +- 
> > jakarta.validation:jakarta.validation-api:jar:2.0.2:compile
> > [INFO] |  |  |  \- jakarta.xml.bind:jakarta.xml.bind-api:ja

[3.17.0] java.lang.NoSuchFieldError: SHACLC

2021-10-25 Thread Martynas Jusevičius
Hi,

Any suggestions as to why I'm getting this error after upgrading from
3.16.0 to 3.17.0?

java.lang.NoSuchFieldError: SHACLC
at org.apache.jena.shacl.compact.SHACLC.init(SHACLC.java:43)
at org.apache.jena.shacl.sys.InitShacl.start(InitShacl.java:30)
at
org.apache.jena.sys.JenaSystem.lambda$init$2(JenaSystem.java:117)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
at org.apache.jena.sys.JenaSystem.forEach(JenaSystem.java:192)
at org.apache.jena.sys.JenaSystem.forEach(JenaSystem.java:169)
at org.apache.jena.sys.JenaSystem.init(JenaSystem.java:115)

I know this usually has to do with clashing JAR versions, but as far
as I can see all the Jena dependencies are 3.17.0:

[INFO] com.atomgraph:processor:jar:3.10.20-SNAPSHOT
[INFO] +- junit:junit:jar:4.13.1:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.3:test
[INFO] +- jakarta.servlet:jakarta.servlet-api:jar:4.0.3:provided
[INFO] +- com.atomgraph:twirl:jar:1.0.22-SNAPSHOT:compile
[INFO] |  +- org.apache.jena:jena-arq:jar:3.17.0:compile
[INFO] |  |  +- org.apache.jena:jena-core:jar:3.17.0:compile
[INFO] |  |  |  +- org.apache.jena:jena-base:jar:3.17.0:compile
[INFO] |  |  |  |  +- org.apache.jena:jena-shaded-guava:jar:3.17.0:compile
[INFO] |  |  |  |  +- org.apache.commons:commons-csv:jar:1.8:compile
[INFO] |  |  |  |  +- commons-codec:commons-codec:jar:1.15:compile
[INFO] |  |  |  |  +- org.apache.commons:commons-compress:jar:1.20:compile
[INFO] |  |  |  |  \- com.github.andrewoma.dexx:collection:jar:0.7:compile
[INFO] |  |  |  +- org.apache.jena:jena-iri:jar:3.17.0:compile
[INFO] |  |  |  \- commons-cli:commons-cli:jar:1.4:compile
[INFO] |  |  +- org.apache.httpcomponents:httpclient:jar:4.5.13:compile
[INFO] |  |  |  \- org.apache.httpcomponents:httpcore:jar:4.4.13:compile
[INFO] |  |  +- com.github.jsonld-java:jsonld-java:jar:0.13.2:compile
[INFO] |  |  |  \- commons-io:commons-io:jar:2.8.0:compile
[INFO] |  |  +- com.fasterxml.jackson.core:jackson-core:jar:2.11.3:compile
[INFO] |  |  +- com.fasterxml.jackson.core:jackson-databind:jar:2.11.3:compile
[INFO] |  |  |  \-
com.fasterxml.jackson.core:jackson-annotations:jar:2.11.3:compile
[INFO] |  |  +- org.apache.httpcomponents:httpclient-cache:jar:4.5.13:compile
[INFO] |  |  +- org.apache.thrift:libthrift:jar:0.13.0:compile
[INFO] |  |  |  \- javax.annotation:javax.annotation-api:jar:1.3.2:compile
[INFO] |  |  \- org.apache.commons:commons-lang3:jar:3.11:compile
[INFO] |  \- org.slf4j:slf4j-log4j12:jar:1.7.25:compile
[INFO] | +- org.slf4j:slf4j-api:jar:1.7.25:compile
[INFO] | \- log4j:log4j:jar:1.2.17:compile
[INFO] +- com.atomgraph:core:jar:3.0.18-SNAPSHOT:compile
[INFO] |  +- 
org.glassfish.jersey.containers:jersey-container-servlet:jar:2.30.1:compile
[INFO] |  |  +-
org.glassfish.jersey.containers:jersey-container-servlet-core:jar:2.30.1:compile
[INFO] |  |  +- org.glassfish.jersey.core:jersey-common:jar:2.30.1:compile
[INFO] |  |  |  +- jakarta.annotation:jakarta.annotation-api:jar:1.3.5:compile
[INFO] |  |  |  +- org.glassfish.hk2:osgi-resource-locator:jar:1.0.3:compile
[INFO] |  |  |  \- com.sun.activation:jakarta.activation:jar:1.2.1:compile
[INFO] |  |  +- org.glassfish.jersey.core:jersey-server:jar:2.30.1:compile
[INFO] |  |  |  +-
org.glassfish.jersey.media:jersey-media-jaxb:jar:2.30.1:compile
[INFO] |  |  |  +- jakarta.validation:jakarta.validation-api:jar:2.0.2:compile
[INFO] |  |  |  \- jakarta.xml.bind:jakarta.xml.bind-api:jar:2.3.2:compile
[INFO] |  |  | \-
jakarta.activation:jakarta.activation-api:jar:1.2.1:compile
[INFO] |  |  \- jakarta.ws.rs:jakarta.ws.rs-api:jar:2.1.6:compile
[INFO] |  +- org.glassfish.jersey.core:jersey-client:jar:2.30.1:compile
[INFO] |  |  \- org.glassfish.hk2.external:jakarta.inject:jar:2.6.1:compile
[INFO] |  +- org.glassfish.jersey.inject:jersey-hk2:jar:2.30.1:compile
[INFO] |  |  +- org.glassfish.hk2:hk2-locator:jar:2.6.1:compile
[INFO] |  |  |  +-
org.glassfish.hk2.external:aopalliance-repackaged:jar:2.6.1:compile
[INFO] |  |  |  +- org.glassfish.hk2:hk2-api:jar:2.6.1:compile
[INFO] |  |  |  \- org.glassfish.hk2:hk2-utils:jar:2.6.1:compile
[INFO] |  |  \- org.javassist:javassist:jar:3.25.0-GA:compile
[INFO] |  \- org.slf4j:jcl-over-slf4j:jar:1.6.4:compile
[INFO] \- org.apache.jena:jena-shacl:jar:3.17.0:compile
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time:  9.902 s
[INFO] Finished at: 2021-10-25T13:14:08+02:00
[INFO] 

Thanks,

Martynas

