Re: Issue with Turtle serialization

2024-08-31 Thread Andy Seaborne




On 31/08/2024 10:59, Chavdar Ivanov wrote:

Yes, but the output is a single line and has the escape characters.
If we serialize with RDFXML this does not happen (the RDF XML source is a 
multi-line string literal and the RDF XML output looks exactly the same)

But RDF XML to Turtle and Turtle to Turtle keeps the parsed escape characters 
in the output



   .set(RIOT.multilineLiterals, true)

which isn't a good name for the symbol and ought to be migrated to one 
starting "symTurtle..."


It'll print the literal with 3-quotes form starting on the current line 
unlike Java multiline literals.
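
As a rough illustration of the two literal forms being discussed, here is a minimal sketch in plain Java. It is not Jena's writer code; the class and method names (TurtleQuoting, shortForm, longForm) are made up for this example. The short form escapes line breaks, which is what the round-tripped output in this thread shows, while the 3-quotes form leaves them as real line breaks. To keep the sketch simple, the long form escapes every double quote, which is valid Turtle but more verbose than strictly necessary.

```java
class TurtleQuoting {
    /** Short form: one line, with backslash, quote and line-break characters escaped. */
    static String shortForm(String s) {
        String body = s.replace("\\", "\\\\")   // escape backslashes first
                       .replace("\"", "\\\"")
                       .replace("\n", "\\n")
                       .replace("\r", "\\r");
        return "\"" + body + "\"";
    }

    /** 3-quotes form: line breaks stay as they are; quotes are escaped for simplicity. */
    static String longForm(String s) {
        String body = s.replace("\\", "\\\\")
                       .replace("\"", "\\\"");
        return "\"\"\"" + body + "\"\"\"";
    }

    public static void main(String[] args) {
        String value = "This is a multi-line\nliteral";
        System.out.println(shortForm(value)); // printed on a single line
        System.out.println(longForm(value));  // printed across two lines
    }
}
```

Both results denote the same string value; only the surface syntax differs.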


Andy



From: Martynas Jusevičius 
Sent: 31 August 2024, 11:29
To: dev@jena.apache.org 
Subject: Re: Issue with Turtle serialization

I think the output is equivalent to the input - two different ways to
encode the same string.
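
A small plain-Java check of this point, that the multi-line and the escaped single-line spellings are the same string. It only compares Java strings and does not involve Jena; the class name is made up for the example and the text is a shortened variant of the literal in the report. Note that a Java text block normalises line endings to \n, whereas the output in the report shows \r\n, presumably because the input file used Windows line endings.

```java
class LiteralEquivalence {
    // Multi-line spelling, written as a Java text block (Java 15 or later).
    static final String MULTI_LINE = """
            This is a multi-line
            literal with a quote (")
            and up to two sequential apostrophes ('').""";

    // Single-line spelling of the same value, using escape sequences.
    static final String ESCAPED =
        "This is a multi-line\nliteral with a quote (\")\nand up to two sequential apostrophes ('').";

    public static void main(String[] args) {
        // Compares the string values, not how they were written in the source.
        System.out.println(MULTI_LINE.equals(ESCAPED));
    }
}
```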

On Sat, 31 Aug 2024 at 09.20, Chavdar Ivanov  wrote:


Hello all,

I noticed something around the support for multiline literals

It seems it is possible to have this for RDFXML writers, but for the
Turtle I am getting single line

I looked in the RDF 1.1 Turtle spec (w3.org);
under section 2.5.1 we have this example:


show:218 show:blurb '''This is a multi-line # literal with embedded new lines and quotes
literal with many quotes (""""")
and up to two sequential apostrophes ('').''' .

I did a test

Input
l3igm:test
 a   sh:PropertyGroup ;
 rdfs:label  "test" ;
 sh:order 0 ;
 l3igm:blurb '''This is a multi-line
literal with many quotes (""""")
and up to two sequential apostrophes ('').''' .


When this is read and then serialized back to ttl I get this

l3igm:test  rdf:type  sh:PropertyGroup;
 rdfs:label   "test";
 sh:order 0;
 l3igm:blurb  "This is a multi-line\r\nliteral with many quotes (\"\"\"\"\")\r\nand up to two sequential apostrophes ('')." .


I am creating the writer this way:


RDFWriter.create()
 .base("https://test.eu/test")
 .set(RIOT.symTurtleOmitBase, false)
 .set(RIOT.symTurtleIndentStyle, "wide")
 .set(RIOT.symTurtleDirectiveStyle, "rdf10")
 .lang(Lang.TURTLE)
 .source(model)
 .output(out);

Is there something I am doing wrong, or is this a bug to be fixed? If it
is a bug, where should the fix be?

Best regards
Chavdar






Re: PR 2501 "Support for SPARQL CDTs (lists and maps as literals)"

2024-08-28 Thread Andy Seaborne




On 31/05/2024 14:32, Andy Seaborne wrote:

We have received a significant PR:

"Support for SPARQL CDTs (lists and maps as literals)"
https://github.com/apache/jena/pull/2501

from Olaf Hartig.

It is not a purely personal contribution and it looks like it should 
have a Software Grant for the specific contribution as well as a 
general ICLA from Olaf.


     Andy


Update:

ASF have received a Software Grant to cover the contribution.

The PR has been squashed and put onto branch gh2518-cdt. It is not 
up-to-date with "main". It was over 300 commits [*] and had conflicts 
when not squashed - the PR started from main some time ago.


Further discussion:
https://github.com/apache/jena/issues/2518

Code:
https://github.com/apache/jena/tree/gh2518-cdt

Andy



[ANNOUNCE] Sergei Zuev elected as Committer

2024-08-10 Thread Andy Seaborne



The Apache Jena PMC has invited Sergei Zuev to become a committer
and we are pleased to announce that he has accepted.

Sergei has made a substantial contribution of the new ontapi module [1].

"Committer" recognizes commitment to the project. If you would like to
learn more please see

http://apache.org/foundation/how-it-works.html#roles

Please join us in welcoming Sergei as a committer.

Andy

[1] https://jena.apache.org/documentation/ontology/



Re: State of the build

2024-08-08 Thread Andy Seaborne




On 31/07/2024 19:15, Phillip Ross wrote:

Hi Andy, I'd been periodically building the main branch of jena in a
container for a while now and only recently bumped into problems with
the fuseki e2e tests consistently not working.  I actually took the
time to patch in a maven profile which allows me to toggle e2e tests
on and off when invoking maven.  It was much quicker for me to add
this patch than trying to troubleshoot why the tests aren't working
in-container, but it would be great to run down the root cause and
have it all working properly in containers on an ongoing basis!


That's an option. Did you set  to true?

Having a docker container would be a useful investment generally, not just 
for ASF Jenkins.


Andy



On Wed, Jul 31, 2024 at 9:46 AM Andy Seaborne  wrote:


The Jenkins build still isn't fixed. I have a container+maven experiment
using a simple script to build a container then running the maven build
in the container but Cypress/networking isn't working properly. If it
did work, it could be converted to a Jenkins pipeline and/or put the
script into git and run that instead of maven directly.

It's only on Jenkins and it's because the servers don't have the Cypress
prerequisites. While getting them added might be possible, I think
having a container-hosted build is itself desirable in the long term.

The Jena build works in GitHub Actions (Linux, Windows, macOS).

For now - and temporarily! - I've made the deploy build run without
tests so development builds are available on
://repository.apache.org/content/groups/snapshots/org/apache/jena/

  Andy

On 15/07/2024 17:09, Andy Seaborne wrote:
  > Marco - thank you. That's good news.
  >
  > The plan is to have a Jenkins pipeline that makes a docker container
  > which gets cached on the ASF Jenkins nodes, not built from scratch each
  > run, then run the Jena build in that container. We can configure the
  > container appropriately.
  >
  > The Dockerfile will be in the Jena code repo.
  >
  > I don't know what we can do about other systems except document the
  > situation.
  >
  > Much reading the Jenkins documentation!
  >
  >  Andy




Re: State of the build

2024-08-08 Thread Andy Seaborne




On 08/08/2024 13:44, Andy Seaborne wrote:



On 07/08/2024 07:16, Bruno Kinoshita wrote:

Can you share the container and the other files from your experiment,
please, Andy? I can have a look why Cypress/networking isn't working.

 >
 > Thanks

The starting point is the Cypress requirements since the last version 
upgrade.


https://docs.cypress.io/guides/getting-started/installing-cypress#Linux-Prerequisites


The ASF Jenkins servers don't have these.

I tried building a docker container (ubuntu based, installing maven and 
also as a maven image) which installed the dependencies, then ran maven 
in the container with Jena as a mounted volume.


Using maven+jammy as the base:

https://gist.github.com/afs/e3d51fa685cd0c0a0ec5a383dc0b4b37


I've been working outside Jenkins to get a basic thing going before 
turning it into a pipeline.




     Andy



If "--network host" in test:e2e
then
  runs some of test:e2e ...

[INFO] [CLIENT] yarn wait-on http://localhost:57915/$/ping && yarn run dev exited with code SIGTERM

[INFO] error Command failed with exit code 1.
[INFO] error Command failed with exit code 1.info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

[INFO] error Command failed with exit code 1.
[INFO] info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.


This is as near to working as I got it at the time.


If no "--network host" in test:e2e

[INFO] [CLIENT]   VITE v5.3.5  ready in 488 ms
[INFO] [CLIENT]
[INFO] [CLIENT]   ➜  Local:   http://localhost:34523/
[INFO] [CLIENT]   ➜  Network: use --host to expose
[INFO] [SERVER] [nodemon] restarting due to changes...
[INFO] [SERVER] [nodemon] starting `node src/services/mock/json-server.js`
[INFO] [SERVER] JSON Server is running
... waits forever ...




Re: State of the build

2024-08-08 Thread Andy Seaborne




On 07/08/2024 07:16, Bruno Kinoshita wrote:

Can you share the container and the other files from your experiment,
please, Andy? I can have a look why Cypress/networking isn't working.

>
> Thanks

The starting point is the Cypress requirements since the last version 
upgrade.


https://docs.cypress.io/guides/getting-started/installing-cypress#Linux-Prerequisites

The ASF Jenkins servers don't have these.

I tried building a docker container (ubuntu based, installing maven and 
also as a maven image) which installed the dependencies, then ran maven 
in the container with Jena as a mounted volume.


Using maven+jammy as the base:

https://gist.github.com/afs/e3d51fa685cd0c0a0ec5a383dc0b4b37

Andy



State of the build

2024-07-31 Thread Andy Seaborne
The Jenkins build still isn't fixed. I have a container+maven experiment 
using a simple script to build a container then running the maven build 
in the container but Cypress/networking isn't working properly. If it 
did work, it could be converted to a Jenkins pipeline and/or put the 
script into git and run that instead of maven directly.


It's only on Jenkins and it's because the servers don't have the Cypress 
prerequisites. While getting them added might be possible, I think 
having a container-hosted build is itself desirable in the long term.


The Jena build works in GitHub Actions (Linux, Windows, macOS).

For now - and temporarily! - I've made the deploy build run without 
tests so development builds are available on 
://repository.apache.org/content/groups/snapshots/org/apache/jena/


    Andy

On 15/07/2024 17:09, Andy Seaborne wrote:
> Marco - thank you. That's good news.
>
> The plan is to have a Jenkins pipeline that makes a docker container
> which gets cached on the ASF Jenkins nodes, not built from scratch each
> run, then run the Jena build in that container. We can configure the
> container appropriately.
>
> The Dockerfile will be in the Jena code repo.
>
> I don't know what we can do about other systems except document the
> situation.
>
> Much reading the Jenkins documentation!
>
>  Andy


[RESULT] [VOTE] Apache Jena 5.1.0

2024-07-18 Thread Andy Seaborne
The vote passes with 3 PMC member votes (Rob, Arne, Andy) and 2 community 
votes from Marco and Øyvind.


Thank you everyone.

The more environments that get used the better, as we found out with 
this release.


Andy

On 12/07/2024 11:53, Andy Seaborne wrote:

Hi,

Here is a vote on the first release candidate for Apache Jena version 
5.1.0.


 Release Vote

This vote will be open until at least

     Tuesday 16th July, 2024 at 08:00 UTC




Re: [VOTE] Apache Jena 5.1.0

2024-07-17 Thread Andy Seaborne

Could we have another PMC vote please?

Andy

On 12/07/2024 11:53, Andy Seaborne wrote:

Hi,

Here is a vote on the first release candidate for Apache Jena version 
5.1.0.


 Release Vote

This vote will be open until at least

     Tuesday 16th July, 2024 at 08:00 UTC

Please vote to approve this release:

     [ ] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release. See the checklist below.

Staging repository:
   https://repository.apache.org/content/repositories/orgapachejena-1065

Proposed dist/ area:
   https://dist.apache.org/repos/dist/dev/jena/

Keys:
   https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
   https://github.com/apache/jena/commit/8cf1104383

Git Commit Hash:
   8cf11043838e312ab6ee82737de664e62d155cd1

Git Commit Tag:
   jena-5.1.0

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

 In this release

Issues in this release:

   https://s.apache.org/jena-5.1.0-issues

The major item for this release is the new artifact jena-ontapi.

It has API support for working with OWL2 as well as other ontologies. It 
is the long-term replacement for org.apache.jena.ontology.


   https://github.com/apache/jena/issues/2160

This is a contribution from @sszuev

== Also

@karolina-telicent
   Prefixes Service
     New endpoint for Fuseki to give read and read-write access to the
     prefixes of a dataset enabling lookup and modification over HTTP.
   https://github.com/apache/jena/issues/2543

Micrometer - Prometheus upgrade
   See https://github.com/micrometer-metrics/micrometer/wiki/1.13-Migration-Guide

   https://github.com/apache/jena/pull/2480

Value space of rdf:XMLLiteral changed to be RDF 1.1/1.2 value semantics.
   Issue https://github.com/apache/jena/issues/2430
   The value space in RDF 1.0 was different.

@TelicentPaul - Paul Gallagher
Migrating Base 64 operations from Apache Commons Codec to Util package.
   https://github.com/apache/jena/pull/2409

Balduin Landolt @BalduinLandolt
   javadoc fix for Literal.getString.
   https://github.com/apache/jena/pull/2251

ØyvindG @OyvindLGjesdal -
   https://github.com/apache/jena/pull/2121
text index fix for
   https://github.com/apache/jena/issues/2094

  @wang3820 Tong Wang
   Fix tests due to assumptions on hashmap order
   https://github.com/apache/jena/pull/2098

@thomasjtaylor Thomas J. Taylor
     Fix for NodeValueFloat
     https://github.com/apache/jena/pull/2374

@Aklakan Claus Stadler
"Incorrect JoinClassifier results with unbound values."
   https://github.com/apache/jena/issues/2412

@Aklakan Claus Stadler
   "QueryExec: abort before exec is ignored."
   https://github.com/apache/jena/issues/2394

@osi peter royal
   Track rule engine instances
   https://github.com/apache/jena/issues/2382
   https://github.com/apache/jena/pull/2432

Normalization/Canonicalization of values
   Including RDFParserBuilder.canonicalValues
     This has been reworked to provide a consistent framework
     and also guarantee the same behavior between parsing
     and TDB2 handling of values.
   https://github.com/apache/jena/issues/2557

---

Checking:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?
+ can the source archive be built?
   (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
   (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
    if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?




Re: [VOTE] Apache Jena 5.1.0

2024-07-15 Thread Andy Seaborne
entral:
https://repo.maven.apache.org/maven2/org/apache/httpcomponents/httpcore/4.4.13/httpcore-4.4.13.jar
Downloaded from central:
https://repo.maven.apache.org/maven2/org/apache/httpcomponents/httpcore/4.4.13/httpcore-4.4.13.jar
(329 kB at 2.4 MB/s)
Downloading from central:
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar
Downloaded from central:
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar
(26 kB at 158 kB/s)
Downloaded from central:
https://repo.maven.apache.org/maven2/com/fasterxml/jackson/core/jackson-annotations/2.13.4/jackson-annotations-2.13.4.jar
(76 kB at 445 kB/s)
Downloaded from central:
https://repo.maven.apache.org/maven2/org/apache/commons/commons-exec/1.3/commons-exec-1.3.jar
(54 kB at 322 kB/s)
Downloaded from central:
https://repo.maven.apache.org/maven2/org/apache/commons/commons-compress/1.21/commons-compress-1.21.jar
(1.0 MB at 2.8 MB/s)
Downloaded from central:
https://repo.maven.apache.org/maven2/com/fasterxml/jackson/core/jackson-databind/2.13.4.2/jackson-databind-2.13.4.2.jar
(1.5 MB at 3.2 MB/s)
[INFO] Installing node version v20.11.0
[INFO] Downloading
https://nodejs.org/dist/v20.11.0/node-v20.11.0-linux-x64.tar.gz to
/home/lotico/.m2/repository/com/github/eirslett/node/20.11.0/node-20.11.0-linux-x64.tar.gz
[INFO] No proxies configured
[INFO] No proxy was configured, downloading directly
[INFO] Unpacking
/home/lotico/.m2/repository/com/github/eirslett/node/20.11.0/node-20.11.0-linux-x64.tar.gz
into /home/lotico/git/jena/jena-fuseki2/jena-fuseki-ui/node/tmp
[INFO] Copying node binary from
/home/lotico/git/jena/jena-fuseki2/jena-fuseki-ui/node/tmp/node-v20.11.0-linux-x64/bin/node
to /home/lotico/git/jena/jena-fuseki2/jena-fuseki-ui/node/node
[INFO] Installed node locally.
[INFO] Installing Yarn version v1.22.17
[INFO] Downloading
https://github.com/yarnpkg/yarn/releases/download/v1.22.17/yarn-v1.22.17.tar.gz
to
/home/lotico/.m2/repository/com/github/eirslett/yarn/1.22.17/yarn-1.22.17.tar.gz
[INFO] No proxies configured
[INFO] No proxy was configured, downloading directly
[INFO] Unpacking
/home/lotico/.m2/repository/com/github/eirslett/yarn/1.22.17/yarn-1.22.17.tar.gz
into /home/lotico/git/jena/jena-fuseki2/jena-fuseki-ui/node/yarn
[INFO] Installed Yarn locally.
[INFO]
[INFO] --- frontend:1.15.0:yarn (yarn install) @ jena-fuseki-ui ---
[INFO] Running 'yarn install --frozen-lockfile' in
/home/lotico/git/jena/jena-fuseki2/jena-fuseki-ui
[INFO] yarn install v1.22.17
[INFO] [1/4] Resolving packages...
[INFO] [2/4] Fetching packages...
[INFO] [3/4] Linking dependencies...
[INFO] warning " > @cypress/code-coverage@3.12.41" has unmet peer
dependency "@babel/core@^7.0.1".
[INFO] warning " > @cypress/code-coverage@3.12.41" has unmet peer
dependency "@babel/preset-env@^7.0.0".
[INFO] warning " > @cypress/code-coverage@3.12.41" has unmet peer
dependency "babel-loader@^8.3 || ^9".
[INFO] warning " > @cypress/code-coverage@3.12.41" has unmet peer
dependency "webpack@^4 || ^5".
[INFO] warning "@cypress/code-coverage > @cypress/webpack-preprocessor@6.0.0"
has unmet peer dependency "@babel/core@^7.0.1".
[INFO] warning "@cypress/code-coverage > @cypress/webpack-preprocessor@6.0.0"
has unmet peer dependency "@babel/preset-env@^7.0.0".
[INFO] warning "@cypress/code-coverage > @cypress/webpack-preprocessor@6.0.0"
has unmet peer dependency "babel-loader@^8.3 || ^9".
[INFO] warning "@cypress/code-coverage > @cypress/webpack-preprocessor@6.0.0"
has unmet peer dependency "webpack@^4 || ^5".
[INFO] [4/4] Building fresh packages...





On Sun, Jul 14, 2024 at 4:36 PM Andy Seaborne  wrote:


Hi Marco,

The list of prerequisites for Cypress is listed at:


https://docs.cypress.io/guides/getting-started/installing-cypress#Linux-Prerequisites

They are all a single "apt install". If you have a moment, I'd be
grateful if you could try that.


I installed nodejs with the exact version you have, and the build still
went fine so it isn't looking like node version leakage.

  Andy

On 13/07/2024 20:41, Marco Neumann wrote:

Hi Andy,

I don't recall the cypress related error but I had to install quite a few
new packages (cross-env, mocha, run-script-os ) to get the tests over the
line.

Many of the new features go beyond my current use cases for jena so it will
take a while before I encounter any issues, in particular if they don't
interfere with use of the base system.

Marco



On Sat, Jul 13, 2024 at 4:02 PM Andy Seaborne  wrote:




On 13/07/2024 10:56, Marco Neumann wrote:

[X] +1

but not deployed yet

had to install xvfb, pnpm and mocha etc to get it pass the Apache Jena -
Fuseki UI tests

mvn clean install



Hi Marco - thank you for trying it out.

Did the build output also say "[INFO] [TESTS] Cypress failed to start.

Re: [VOTE] Apache Jena 5.1.0

2024-07-14 Thread Andy Seaborne

Hi Marco,

The list of prerequisites for Cypress is listed at:

https://docs.cypress.io/guides/getting-started/installing-cypress#Linux-Prerequisites

They are all a single "apt install". If you have a moment, I'd be 
grateful if you could try that.



I installed nodejs with the exact version you have, and the build still 
went fine so it isn't looking like node version leakage.


Andy

On 13/07/2024 20:41, Marco Neumann wrote:

Hi Andy,

I don't recall the cypress related error but I had to install quite a few
new packages (cross-env, mocha, run-script-os ) to get the tests over the
line.

Many of the new features go beyond my current use cases for jena so it will
take a while before I encounter any issues, in particular if they don't
interfere with use of the base system.

Marco



On Sat, Jul 13, 2024 at 4:02 PM Andy Seaborne  wrote:




On 13/07/2024 10:56, Marco Neumann wrote:

[X] +1

but not deployed yet

had to install xvfb, pnpm and mocha etc to get it pass the Apache Jena -
Fuseki UI tests

mvn clean install



Hi Marco - thank you for trying it out.

Did the build output also say "[INFO] [TESTS] Cypress failed to start."
earlier in the log?

"""
[INFO] [TESTS] [STARTED] Task without title.
[INFO] [TESTS] [FAILED] Cypress failed to start.
[INFO] [TESTS] [FAILED]
[INFO] [TESTS] [FAILED] This may be due to a missing library or
dependency. https://on.cypress.io/required-dependencies
[INFO] [TESTS] [FAILED]
"""

The ASF Jenkins instance is experiencing this error and it appears to be
dependency related. It is not clear why it has started (the PR
where it started doesn't, on the surface, seem to be a likely-looking change).

  > Node v18.19.1

Hmm - the POM file says

  v20.11.0

so that seems to only apply to the build, not running tests.


It could be that the Jena build is now running on older environments -
the ASF Jenkins build fleet has about 30 Ubuntu servers (!!) for general
use and from mixed donations.

The problem is noted in the pre-release thread:
https://lists.apache.org/thread/drfj9lmh65gcj6d609tx52qdb1n4wrxc

and because the build is passing on GitHub Actions and on a local machine, it
seems better to get the release out and come back and fix it.
Jena ought to move to using a Jenkins pipeline with a docker image so that
the build gets isolation. We fairly recently had problems with nodejs
versions on older build servers but in that case nodejs was updated.

  Andy


[ERROR] Failed to execute goal
com.github.eirslett:frontend-maven-plugin:1.15.0:yarn (yarn run test:e2e)
on project jena-fuseki-ui: Failed to run task: 'yarn run test:e2e' failed.
org.apache.commons.exec.ExecuteException: Process exited with an error: 1
(Exit value: 1) -> [Help 1]

[INFO] Reactor Summary for Apache Jena 5.1.0:
[INFO]
[INFO] Apache Jena ........................................ SUCCESS [  9.543 s]
[INFO] Apache Jena - IRI .................................. SUCCESS [  2.671 s]
[INFO] Apache Jena - Base ................................. SUCCESS [  3.686 s]
[INFO] Apache Jena - Core ................................. SUCCESS [ 28.747 s]
[INFO] Apache Jena - ARQ .................................. SUCCESS [ 25.787 s]
[INFO] Apache Jena - ONTAPI ............................... SUCCESS [ 10.581 s]
[INFO] Apache Jena - SHACL ................................ SUCCESS [  3.326 s]
[INFO] Apache Jena - ShEx ................................. SUCCESS [  3.885 s]
[INFO] Apache Jena - RDF Patch ............................ SUCCESS [  2.371 s]
[INFO] Apache Jena - RDF Connection ....................... SUCCESS [  2.447 s]
[INFO] Apache Jena - DBOE Database Operation Environment .. SUCCESS [  0.066 s]
[INFO] Apache Jena - DBOE Base ............................ SUCCESS [  2.156 s]
[INFO] Apache Jena - DBOE Transactions .................... SUCCESS [  1.825 s]
[INFO] Apache Jena - DBOE Indexes ......................... SUCCESS [  1.079 s]
[INFO] Apache Jena - DBOE Index test suite ................ SUCCESS [  0.320 s]
[INFO] Apache Jena - DBOE Transactional Datastructures .... SUCCESS [ 20.250 s]
[INFO] Apache Jena - DBOE Storage ......................... SUCCESS [  1.694 s]
[INFO] Apache Jena - TDB1 (Native Triple Store) ........... SUCCESS [  7.363 s]
[INFO] Apache Jena - TDB2 (Native Triple Store) ........... SUCCESS [  6.352 s]
[INFO] Apache Jena - Libraries POM ........................ SUCCESS [  0.243 s]
[INFO] Apache Jena - Command line tools ................... SUCCESS [  2.839 s]
[INFO] Apache Jena - SPARQL Text Search ................... SUCCESS [  5.312 s]
[INFO] Apache Jena - Fuseki - A SPARQL 1.1 Server ......... SUCCESS [  0.024 s]
[INFO] Apache Jena - Fuseki Core Engine ................... SUCCESS [  5.540 s]
[INFO] Apache Jena - Fuseki UI ............................ FAILURE [  9.412 s]
[INFO] Apache Jena - Fuseki Data Access Control ........... SKIPPED
[INFO] Apache J

Re: [VOTE] Apache Jena 5.1.0

2024-07-13 Thread Andy Seaborne




On 13/07/2024 10:56, Marco Neumann wrote:

[X] +1

but not deployed yet

had to install xvfb, pnpm and mocha etc to get it pass the Apache Jena -
Fuseki UI tests

mvn clean install



Hi Marco - thank you for trying it out.

Did the build output also say "[INFO] [TESTS] Cypress failed to start." 
earlier in the log?


"""
[INFO] [TESTS] [STARTED] Task without title.
[INFO] [TESTS] [FAILED] Cypress failed to start.
[INFO] [TESTS] [FAILED]
[INFO] [TESTS] [FAILED] This may be due to a missing library or 
dependency. https://on.cypress.io/required-dependencies

[INFO] [TESTS] [FAILED]
"""

The ASF Jenkins instance is experiencing this error and it appears to be 
dependency related. It is not clear why it has started (the PR 
where it started doesn't, on the surface, seem to be a likely-looking change).


> Node v18.19.1

Hmm - the POM file says

v20.11.0

so that seems to only apply to the build, not running tests.


It could be that the Jena build is now running on older environments - 
the ASF Jenkins build fleet has about 30 Ubuntu servers (!!) for general 
use and from mixed donations.


The problem is noted in the pre-release thread:
https://lists.apache.org/thread/drfj9lmh65gcj6d609tx52qdb1n4wrxc

and because the build is passing on GitHub Actions and on a local machine, it 
seems better to get the release out and come back and fix it.
Jena ought to move to using a Jenkins pipeline with a docker image so that 
the build gets isolation. We fairly recently had problems with nodejs 
versions on older build servers but in that case nodejs was updated.


Andy


[ERROR] Failed to execute goal
com.github.eirslett:frontend-maven-plugin:1.15.0:yarn (yarn run test:e2e)
on project jena-fuseki-ui: Failed to run task: 'yarn run test:e2e' failed.
org.apache.commons.exec.ExecuteException: Process exited with an error: 1
(Exit value: 1) -> [Help 1]

[INFO] Reactor Summary for Apache Jena 5.1.0:
[INFO]
[INFO] Apache Jena ........................................ SUCCESS [  9.543 s]
[INFO] Apache Jena - IRI .................................. SUCCESS [  2.671 s]
[INFO] Apache Jena - Base ................................. SUCCESS [  3.686 s]
[INFO] Apache Jena - Core ................................. SUCCESS [ 28.747 s]
[INFO] Apache Jena - ARQ .................................. SUCCESS [ 25.787 s]
[INFO] Apache Jena - ONTAPI ............................... SUCCESS [ 10.581 s]
[INFO] Apache Jena - SHACL ................................ SUCCESS [  3.326 s]
[INFO] Apache Jena - ShEx ................................. SUCCESS [  3.885 s]
[INFO] Apache Jena - RDF Patch ............................ SUCCESS [  2.371 s]
[INFO] Apache Jena - RDF Connection ....................... SUCCESS [  2.447 s]
[INFO] Apache Jena - DBOE Database Operation Environment .. SUCCESS [  0.066 s]
[INFO] Apache Jena - DBOE Base ............................ SUCCESS [  2.156 s]
[INFO] Apache Jena - DBOE Transactions .................... SUCCESS [  1.825 s]
[INFO] Apache Jena - DBOE Indexes ......................... SUCCESS [  1.079 s]
[INFO] Apache Jena - DBOE Index test suite ................ SUCCESS [  0.320 s]
[INFO] Apache Jena - DBOE Transactional Datastructures .... SUCCESS [ 20.250 s]
[INFO] Apache Jena - DBOE Storage ......................... SUCCESS [  1.694 s]
[INFO] Apache Jena - TDB1 (Native Triple Store) ........... SUCCESS [  7.363 s]
[INFO] Apache Jena - TDB2 (Native Triple Store) ........... SUCCESS [  6.352 s]
[INFO] Apache Jena - Libraries POM ........................ SUCCESS [  0.243 s]
[INFO] Apache Jena - Command line tools ................... SUCCESS [  2.839 s]
[INFO] Apache Jena - SPARQL Text Search ................... SUCCESS [  5.312 s]
[INFO] Apache Jena - Fuseki - A SPARQL 1.1 Server ......... SUCCESS [  0.024 s]
[INFO] Apache Jena - Fuseki Core Engine ................... SUCCESS [  5.540 s]
[INFO] Apache Jena - Fuseki UI ............................ FAILURE [  9.412 s]
[INFO] Apache Jena - Fuseki Data Access Control ........... SKIPPED
[INFO] Apache Jena - Fuseki Server Main ................... SKIPPED
[INFO] Apache Jena - Fuseki Server Jar .................... SKIPPED
[INFO] Apache Jena - Fuseki Webapp ........................ SKIPPED
[INFO] Apache Jena - Fuseki WAR File ...................... SKIPPED
[INFO] Apache Jena - Fuseki Server Standalone Jar ......... SKIPPED
[INFO] Apache Jena - Fuseki Docker Tools .................. SKIPPED
[INFO] Apache Jena - Fuseki Binary Distribution ........... SKIPPED
[INFO] Apache Jena - GeoSPARQL Engine ..................... SKIPPED
[INFO] Apache Jena - Fuseki with GeoSPARQL Engine ......... SKIPPED
[INFO] Apache Jena - Integration Testing .................. SKIPPED
[INFO] Apache Jena - Benchmark Suite ...................... SKIPPED
[INFO] Apache Jena - Benchmarks Shaded Jena 4.8.0 ......... SKIPPED
[INFO] Apache Jena - Benchmarks JMH ....................... SKIPPED
[INFO] Apache Jena - Distribution ......................... SKIPPED
[INFO] Apache Jena - Securi

Re: [VOTE] Apache Jena 5.1.0

2024-07-13 Thread Andy Seaborne

Hi Øyvind,

Thank you for reporting that. It'll be correct in the [ANN] email.

On 12/07/2024 22:15, Øyvind Gjesdal wrote:

I think some of the release notes are in Jena 5.0.0:

Balduin Landolt @BalduinLandolt (merged into main on feb 6)
javadoc fix for Literal.getString.
https://github.com/apache/jena/pull/2251

ØyvindG @OyvindLGjesdal  (merged into main on feb 8)
https://github.com/apache/jena/pull/2121
text index fix for
https://github.com/apache/jena/issues/2094


Re: Towards Jena 5.1.0

2024-07-12 Thread Andy Seaborne




On 10/07/2024 19:48, Andy Seaborne wrote:
...

1. "Cypress failed to start"

"This may be due to a missing library or dependency."

It only happens on Jenkins and the build is OK on github actions and my 
local machine. I'm guessing it is a version dependency issue.


Starts at PR #2410

https://github.com/apache/jena/pull/2410/files


Jenkins is still having bad days.

I've deployed 5.2.0-SNAPSHOT directly so there is an entry for the 5.2.0 
development cycle.


Andy




Re: [VOTE] Apache Jena 5.1.0

2024-07-12 Thread Andy Seaborne

[x] +1 Approve the release

On 12/07/2024 11:53, Andy Seaborne wrote:

Hi,

Here is a vote on the first release candidate for Apache Jena version 
5.1.0.


 Release Vote

This vote will be open until at least

     Tuesday 16th July, 2024 at 08:00 UTC

Please vote to approve this release:

     [ ] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release. See the checklist below.


[VOTE] Apache Jena 5.1.0

2024-07-12 Thread Andy Seaborne

Hi,

Here is a vote on the first release candidate for Apache Jena version 5.1.0.

 Release Vote

This vote will be open until at least

Tuesday 16th July, 2024 at 08:00 UTC

Please vote to approve this release:

[ ] +1 Approve the release
[ ]  0 Don't care
[ ] -1 Don't release, because ...

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release. See the checklist below.

Staging repository:
  https://repository.apache.org/content/repositories/orgapachejena-1065

Proposed dist/ area:
  https://dist.apache.org/repos/dist/dev/jena/

Keys:
  https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
  https://github.com/apache/jena/commit/8cf1104383

Git Commit Hash:
  8cf11043838e312ab6ee82737de664e62d155cd1

Git Commit Tag:
  jena-5.1.0

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

 In this release

Issues in this release:

  https://s.apache.org/jena-5.1.0-issues

The major item for this release is the new artifact jena-ontapi.

It has API support for working with OWL2 as well as other ontologies. It 
is the long-term replacement for org.apache.jena.ontology.


  https://github.com/apache/jena/issues/2160

This is a contribution from @sszuev

== Also

@karolina-telicent
  Prefixes Service
New endpoint for Fuseki to give read and read-write access to the
prefixes of a dataset enabling lookup and modification over HTTP.
  https://github.com/apache/jena/issues/2543

Micrometer - Prometheus upgrade
  See 
https://github.com/micrometer-metrics/micrometer/wiki/1.13-Migration-Guide

  https://github.com/apache/jena/pull/2480

Value space of rdf:XMLLiteral changed to be RDF 1.1/1.2 value semantics.
  Issue https://github.com/apache/jena/issues/2430
  The value space in RDF 1.0 was different.

@TelicentPaul - Paul Gallagher
Migrating Base 64 operations from Apache Commons Codec to Util package.
  https://github.com/apache/jena/pull/2409

Balduin Landolt @BalduinLandolt
  javadoc fix for Literal.getString.
  https://github.com/apache/jena/pull/2251

ØyvindG @OyvindLGjesdal -
  https://github.com/apache/jena/pull/2121
text index fix for
  https://github.com/apache/jena/issues/2094

 @wang3820 Tong Wang
  Fix tests due to assumptions on hashmap order
  https://github.com/apache/jena/pull/2098

@thomasjtaylor Thomas J. Taylor
Fix for NodeValueFloat
https://github.com/apache/jena/pull/2374

@Aklakan Claus Stadler
"Incorrect JoinClassifier results with unbound values."
  https://github.com/apache/jena/issues/2412

@Aklakan Claus Stadler
  "QueryExec: abort before exec is ignored."
  https://github.com/apache/jena/issues/2394

@osi peter royal
  Track rule engine instances
  https://github.com/apache/jena/issues/2382
  https://github.com/apache/jena/pull/2432

Normalization/Canonicalization of values
  Including RDFParserBuilder.canonicalValues
This has been reworked to provide a consistent framework
and also guarantee the same behavior between parsing
and TDB2 handling of values.
  https://github.com/apache/jena/issues/2557
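
As a rough illustration of what canonicalizing a value means, here is a minimal sketch for xsd:integer lexical forms only. It is not Jena's implementation: the class and method names are hypothetical, and the real framework covers many more datatypes. The point is that one shared function applied in both places (parsing and storage) gives the same result in both.

```java
import java.math.BigInteger;

// Hypothetical helper, for illustration only; not part of the Jena API.
final class CanonicalIntegerSketch {

    /**
     * Canonical lexical form of an xsd:integer:
     * no leading '+', no leading zeros, and "-0" becomes "0".
     * Throws NumberFormatException for input that is not an integer.
     */
    static String canonicalInteger(String lexical) {
        // BigInteger accepts an optional leading sign and prints
        // the shortest decimal form.
        return new BigInteger(lexical.trim()).toString();
    }

    public static void main(String[] args) {
        System.out.println(canonicalInteger("+001"));
        System.out.println(canonicalInteger("-007"));
        System.out.println(canonicalInteger("-0"));
    }
}
```

If a parser and a store both pass values through the same function, "+001" and "1" end up as the same stored term regardless of which path the data took.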

---

Checking:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?
+ can the source archive be built?
  (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
  (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
   if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?
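
For the checksum item in the checklist, a minimal sketch of one way to compare a downloaded file against its published SHA-512 value. The class name and the command-line arguments are hypothetical, and it assumes the checksum text starts with the hex digest (optionally followed by a file name). It does not cover GPG signatures, which need the gpg tool and the KEYS file.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HexFormat;

// Hypothetical helper, for illustration only.
final class ChecksumCheck {

    /** Compute the SHA-512 digest of a file as lowercase hex. */
    static String sha512Hex(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return HexFormat.of().formatHex(md.digest());
    }

    /**
     * Compare against the published checksum text.
     * Assumption: the first whitespace-separated token is the hex digest.
     */
    static boolean matches(Path file, String checksumText) throws Exception {
        String expected = checksumText.trim().split("\\s+")[0];
        return sha512Hex(file).equalsIgnoreCase(expected);
    }

    public static void main(String[] args) throws Exception {
        // args[0]: the artifact, args[1]: the matching checksum file
        Path artifact = Path.of(args[0]);
        String published = Files.readString(Path.of(args[1]));
        System.out.println(matches(artifact, published) ? "OK" : "MISMATCH");
    }
}
```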


Re: Towards Jena 5.1.0

2024-07-10 Thread Andy Seaborne

A bit bumpy!

The current state is that the codebase is now ready.

There are three issues to do with the UI part of the build:

1. "Cypress failed to start"

"This may be due to a missing library or dependency."

It only happens on Jenkins and the build is OK on github actions and my 
local machine. I'm guessing it is a version dependency issue.


Starts at PR #2410

https://github.com/apache/jena/pull/2410/files


2. Sass deprecation warnings - these will need addressing but are not a 
problem. Starts at PR #2573 (dependabot)



3. In the unmerged PR 2574 / local build only
test:unit failure.
This is a blocker. The upgrade is of vitest from 1.6.0 to 2.0.1 (OK - 
so major version jump)


https://github.com/apache/jena/pull/2574 from dependabot.
(this has been marked "draft")


Given all this, I think we should release without point 3 (PR 2574) 
which is only about the build and test.


    Andy

On 17/06/2024 14:54, Andy Seaborne wrote:

Jena 5.0.0 was released March 16th.
It's about time for Jena 5.1.0.

The main feature for 5.1.0 is the new ontapi module which includes a 
Java API for working with OWL2.


Issues: https://github.com/apache/jena/issues/2160
Code:   https://github.com/apache/jena/tree/main/jena-ontapi

Draft documentation:

https://github.com/apache/jena-site/blob/jena-next/source/documentation/ontology/__index.md

Issues closed in Jena 5.1.0 so far
   https://s.apache.org/jena-5.1.0-issues
45 issues

--

Beyond Jena 5.1.0:

There are several PRs in the backlog. A major one is the SPARQL 
extensions for lists and maps as literals. CDT = "Composite Datatype 
Literals". It is not quite ready to merge.


The website:
   https://github.com/awslabs/SPARQL-CDTs

Issue: https://github.com/apache/jena/issues/2518
PR:    https://github.com/apache/jena/pull/2501

which is a contribution from AWSlabs for an implementation of this.
This would be "experimental" meaning it is subject to change.  There 
should be no impact if the feature isn't used.


SPARQL and RDF features do need a way to get from solid ideas to 
practical experience from real verification.


It has also been submitted as a SPARQL change (SEP-0009) and there is also 
an implementation in Attean (Perl based):


   https://github.com/kasei/attean

I think it is better to not have two major items in a release so the 
suggestion is release Jena 5.1.0 and have a shorter (1-2 month) cycle 
for Jena 5.2.0 (if that works out).


== Current state

The Jenkins and github actions are all passing.

There is a backlog of PRs.


Towards Jena 5.1.0

2024-06-17 Thread Andy Seaborne

Jena 5.0.0 was released March 16th.
It's about time for Jena 5.1.0.

The main feature for 5.1.0 is the new ontapi module which includes a 
Java API for working with OWL2.


Issues: https://github.com/apache/jena/issues/2160
Code:   https://github.com/apache/jena/tree/main/jena-ontapi

Draft documentation:

https://github.com/apache/jena-site/blob/jena-next/source/documentation/ontology/__index.md

Issues closed in Jena 5.1.0 so far
  https://s.apache.org/jena-5.1.0-issues
45 issues

--

Beyond Jena 5.1.0:

There are several PRs in the backlog. A major one is the SPARQL 
extensions for lists and maps as literals. CDT = "Composite Datatype 
Literals". It is not quite ready to merge.


The website:
  https://github.com/awslabs/SPARQL-CDTs

Issue: https://github.com/apache/jena/issues/2518
PR:    https://github.com/apache/jena/pull/2501

which is a contribution from AWSlabs for an implementation of this.
This would be "experimental" meaning it is subject to change.  There 
should be no impact if the feature isn't used.


SPARQL and RDF features do need a way to get from solid ideas to 
practical experience from real verification.


It has also been submitted as a SPARQL change (SEP-0009) and there is also 
an implementation in Attean (Perl based):


  https://github.com/kasei/attean

I think it is better to not have two major items in a release so the 
suggestion is release Jena 5.1.0 and have a shorter (1-2 month) cycle 
for Jena 5.2.0 (if that works out).


== Current state

The Jenkins and github actions are all passing.

There is a backlog of PRs.


Re: Legal question: Is non-commercial licensing (like in CC BY-NC-SA) okay for test resources?

2024-06-10 Thread Andy Seaborne




On 09/06/2024 14:55, Arne Bernhardt wrote:

Hi,

the ENTSO-E published CIM/CGMES test data on CIM Conformity and
Interoperability
.

The Test Configurations v3.0.2

are
published under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0
International License .

Is it possible to use the test data in test resources and unit tests within
Apache Jena? 


As in including in the source codebase?
Only if license compatible.

https://www.apache.org/legal/resolved.html

Implementing standards that aren't license compatible is not an issue 
per se. For example, W3C standards are not open source.

The W3C document license says:

"""
 the publication of derivative works of this document for use as a 
technical specification is expressly prohibited.

"""

That's a restriction of use.  Not open source.

It's common in standards - they should be "closed" and derived works 
could be competing.


W3C tests are open source - the W3C Software license is BSD-type - and 
Jena includes copies of all tests (relying on the build pulling things 
over HTTP over the open web is, even today, fragile ... and the remote 
may change).



Apache Jena itself should be open to commercial usage, but
could the tests be considered non-commercial?

Should I contact legal-disc...@apache.org?


If in doubt, that would be a good idea. The project can have opinions but 
that isn't definitive.



There is also the option to contact ENTSO-E directly.

Note:
The RDFS and SHACL contained in Application Profiles v3.0.1

are
licensed under the Apache License Version 2.0, so the standard itself is
open.


The tests are open.

That zip has an Apache License LICENSE so it is good IF there are no 
other statements on certain files to the contrary (not uncommon :-()


The Readme.docs says:

"""
ENTSO-E neither warrants, nor represents that the use of the test 
configurations (models), documents and application profiles will not 
infringe the rights of third parties. Any use of the test configurations 
(models), documents and application profiles shall include a reference 
to ENTSO-E. ENTSO-E web site is the only official source of information 
related to these test configurations (models), documents and application 
profiles.

"""

I would have thought
"shall include a reference to ENTSO-E" puts a requirement on Jena to 
include it in Jena's NOTICE. Not sure if that means it is still "Apache 
License".


I haven't been through the whole zip.

Andy



- Arne



PR 2501 "Support for SPARQL CDTs (lists and maps as literals)"

2024-05-31 Thread Andy Seaborne

We have received a significant PR:

"Support for SPARQL CDTs (lists and maps as literals)"
https://github.com/apache/jena/pull/2501

from Olaf Hartig.

It is not a purely personal contribution and it looks like it should 
have a Software Grant for the specific contribution as well as a 
general ICLA from Olaf.


Andy


Re: SPDX, Rats and Configs --- oh my

2024-05-18 Thread Andy Seaborne




On 17/05/2024 07:42, Claude Warren wrote:

Greetings,

I saw a note from Andy awhile back about exploring SPDX tag usage in Rat.
I am currently working on Rat to make it much more configurable.  Recent
changes include the ability to detect SPDX license statements and an
upcoming change that will check licenses found in archives (e.g. jars in a
lib dir).

My question is, Is there something, some knob or lever or action, that
could be added to Rat, that would help process the Jena releases?


What I have been wondering is whether we should add the SPDX license type

SPDX-License-Identifier: Apache-2.0

I don't know what common practice is currently across Apache projects.

The thing to avoid is repeated churn, and especially removing some new 
piece of information or feature when a few downstream might have started 
using the information.


c.f. CycloneDX and/or SPDX SBOM.


Is there any such change for any other project you are working on?


At £job, we're in a phase of developing checking workflows and if we 
find anything for dependencies (Jena is a dependency) that would improve 
anything, we'll feed it back.




Note: there have been lots of changes.  Defining licenses is now simply
including a configuration file, licenses can be excluded, Copyright and
SPDX specific tests can be added to license checks.  Checks can be either
required or prohibited.  Checks can be grouped with "all" or "any".


Jena uses "build-files/rat-exclusions.txt" which has improved managing 
RAT configuration from when it was in the POM.


It does sound like there are more RAT changes which can be used to do a 
better job for the W3C test files which would be nice.




Any input would be appreciated.
Claude



Re: [] Accept jena-ontapi into the Apache Jena codebase

2024-05-08 Thread Andy Seaborne

OWL2 API support now in development builds.

Andy


[RESULT][VOTE][LAZY] Accept jena-ontapi into the Apache Jena codebase

2024-05-08 Thread Andy Seaborne

The VOTE passes with +1's from Arne, Claude and Andy.

I'll merge the PR.

Andy

On 03/05/2024 18:17, Andy Seaborne wrote:

This is a lazy consensus VOTE to accept a PR for a new module
jena-ontapi that enhances Apache Jena with OWL2 support.

https://github.com/apache/jena/pull/2420
from @sszuev

This vote is open until

    05:00am UTC Wednesday 8th May 2024

The code in this PR is within a new jena-ontapi module except for two 
changes in the top Jena POM to add the module into the build.


There is no change to org.apache.jena.ontology code in jena-core.

There are no new dependencies for downstream user/application code.

PR README.md (temporary link)

https://github.com/apache/jena/blob/f1584c53c9834a38248dbfca1121214c8cdff4d8/jena-ontapi/README.md

Previous discussion:
https://lists.apache.org/thread/yr9q394fssr0mvgxvrskynmhjlz0g33x

     Andy


Re: "java.lang.IllegalStateException: There is already a Shiro environment associated with the current ServletContext. " when starting latest Fuseki (from src build) on Tomcat 10

2024-05-03 Thread Andy Seaborne




On 03/05/2024 00:38, Phillip Rhodes wrote:

Hi Jena team, just FYI...

I just tried building the latest from "main" from the Github repo
(plus the contents of PR #2445 -
https://github.com/apache/jena/pull/2445) and when I try to launch the
Fuseki war file in Tomcat 10 (with Java 17) I get this business:


PR2445 is merged and there are development builds:

https://repository.apache.org/content/groups/snapshots/org/apache/jena/jena-fuseki-war/5.1.0-SNAPSHOT/

I'm trying with:

jena-fuseki-war-5.1.0-20240503.102541-25.war



02-May-2024 23:30:50.853 INFO [main] org.apache.catalina.core.ApplicationContext.log Initializing Shiro environment
02-May-2024 23:30:51.069 SEVERE [main] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [org.apache.shiro.ee.listeners.EnvironmentLoaderListener]
java.lang.IllegalStateException: There is already a Shiro environment associated with the current ServletContext.  Check if you have multiple EnvironmentLoader* definitions in your web.xml!


I get the same error.

Clean Tomcat 10.1.23 installed from the Tomcat download, running from the 
command line, not running as a service.



at org.apache.shiro.web.env.EnvironmentLoader.initEnvironment(EnvironmentLoader.java:132)
at org.apache.shiro.ee.listeners.EnvironmentLoaderListener.contextInitialized(EnvironmentLoaderListener.java:76)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4453)
at

...



Not sure if this is a "Phil is being stupid" thing,


No.


or a "this is
expected / known issue" thing, or something that actually needs
attention.


Looks like it


But in case it is something, I thought I'd point it out.


Thank you!

Andy




Cheers,


Phil


Re: [VOTE][LAZY] Accept jena-ontapi into the Apache Jena codebase

2024-05-03 Thread Andy Seaborne

+1

On 03/05/2024 18:17, Andy Seaborne wrote:

This is a lazy consensus VOTE to accept a PR for a new module
jena-ontapi that enhances Apache Jena with OWL2 support.

https://github.com/apache/jena/pull/2420
from @sszuev

This vote is open until

    05:00am UTC Wednesday 8th May 2024

The code in this PR is within a new jena-ontapi module except for two 
changes in the top Jena POM to add the module into the build.


There is no change to org.apache.jena.ontology code in jena-core.

There are no new dependencies for downstream user/application code.

PR README.md (temporary link)

https://github.com/apache/jena/blob/f1584c53c9834a38248dbfca1121214c8cdff4d8/jena-ontapi/README.md

Previous discussion:
https://lists.apache.org/thread/yr9q394fssr0mvgxvrskynmhjlz0g33x

     Andy


[VOTE][LAZY] Accept jena-ontapi into the Apache Jena codebase

2024-05-03 Thread Andy Seaborne

This is a lazy consensus VOTE to accept a PR for a new module
jena-ontapi that enhances Apache Jena with OWL2 support.

https://github.com/apache/jena/pull/2420
from @sszuev

This vote is open until

   05:00am UTC Wednesday 8th May 2024

The code in this PR is within a new jena-ontapi module except for two 
changes in the top Jena POM to add the module into the build.


There is no change to org.apache.jena.ontology code in jena-core.

There are no new dependencies for downstream user/application code.

PR README.md (temporary link)

https://github.com/apache/jena/blob/f1584c53c9834a38248dbfca1121214c8cdff4d8/jena-ontapi/README.md

Previous discussion:
https://lists.apache.org/thread/yr9q394fssr0mvgxvrskynmhjlz0g33x

Andy


Re: Contribution of jena-ontapi module.

2024-04-28 Thread Andy Seaborne




On 20/04/2024 17:07, Andy Seaborne wrote:

PMC,

We've received a contribution which includes OWL2 support.

https://github.com/apache/jena/issues/2160 "Support for OWL2"
https://github.com/apache/jena/pull/2420 "jena-ontapi module"

README:
https://github.com/apache/jena/pull/2420/files#diff-c8f3f6da514f1c8fd82f305b56cfac2b95784632984810e01943bfd16befe82a

The contribution is self-contained - it doesn't alter any other part of 
the Jena code base.


It is a significant addition so I think we need to get a Software Grant 
for it to keep everything neat and tidy process-wise.


Having read the intellectual property clearance process [1]  and looked 
at the list of previous contributions in the Foundation, I don't now 
think we need a Software Grant. While the CCLA has a "Software Grant" 
section, the ICLA does not.


We would need one if there was an owning organisation involved, but here we 
have an individual contribution, so the contributor would be signing 
both ICLA and Software Grant - nothing is gained.


[1] https://incubator.apache.org/ip-clearance/index.html

Andy


The PR builds (maven + OpenJDK), some Javadoc warnings.
It doesn't compile in Eclipse. It looks like the errors are generics 
related.


Questions about the technical content of the contribution on the PR please.

     Andy


Contribution of jena-ontapi module.

2024-04-20 Thread Andy Seaborne

PMC,

We've received a contribution which includes OWL2 support.

https://github.com/apache/jena/issues/2160 "Support for OWL2"
https://github.com/apache/jena/pull/2420 "jena-ontapi module"

README:
https://github.com/apache/jena/pull/2420/files#diff-c8f3f6da514f1c8fd82f305b56cfac2b95784632984810e01943bfd16befe82a

The contribution is self-contained - it doesn't alter any other part of 
the Jena code base.


It is a significant addition so I think we need to get a Software Grant 
for it to keep everything neat and tidy process-wise.


The PR builds (maven + OpenJDK), some Javadoc warnings.
It doesn't compile in Eclipse. It looks like the errors are generics 
related.


Questions about the technical content of the contribution on the PR please.

Andy


[ANN] New PMC member : Arne Bernhardt

2024-04-05 Thread Andy Seaborne
We are pleased to announce that Arne has accepted an invitation to join 
the Apache Jena PMC.


Please welcome Arne to this role!

Andy


[RESULT] [VOTE] Apache Jena 5.0.0

2024-03-20 Thread Andy Seaborne



The vote passes with 3 PMC members (Rob, Claude, Andy) and two 
community votes from Arne and Marco.


On to pushing out the release ...

Andy

On 16/03/2024 18:32, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena version 5.0.0.

 Release Vote

This vote will be open until at least

     Wednesday 20th March 2024 at 08:00 UTC

Please vote to approve this release:

     [ ] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...


Re: [] Apache Jena 5.0.0

2024-03-16 Thread Andy Seaborne




On 16/03/2024 21:48, Andy Seaborne wrote:



On 16/03/2024 19:50, Arne Bernhardt wrote:

Hi,

it may be nothing but on my system there are a few "ERROR "s in the 
console that I can't categorise (see attached log). The general result 
is a successful build.

For example on line 4250:
"[ERROR] There are test failures.
Failed to run task: 'yarn run test:e2e' failed.
com.github.eirslett.maven.plugins.frontend.lib.TaskRunnerException: 
'yarn run test:e2e' failed. ..."


Hi Arne - thanks for checking the release.

This is from the jena-fuseki-ui.

It looks like a failure to run the test framework, not a test failure.
The e2e test framework is sensitive to the environment.

"Process exited with an error: 1 (Exit value: 1)" isn't the most 
informative of error messages :-)


It may be because (despite the yarn,node download) something in the 
toolchain is an old version.


I have:
   node --version => v18.19.1
   yarn --version => 1.22.19
   npm --version  => 10.2.4

I checked the bots. The github action for MS Windows runs it fine; the 
Jenkins Windows job has the report you have.


Mistake - the GH action is also failing (I was looking at the wrong OS).

Maybe this exec'ed on Windows is the cause:
`yarn run serve:fuseki`

Logged as:
https://github.com/apache/jena/issues/2344

I'd still like to continue with this release if someone can confirm the 
Fuseki UI is produced and put into Fuseki/webapp as expected.


Andy



This maven module produces jena-fuseki-ui-5.0.0.jar and this is unpacked 
in jena-fuseki-webapp:pom.xml by maven-dependency-plugin. (a way to pass 
the built vue app through the build artifacts).


The jena-fuseki-webapp build step succeeded so it looks like 
jena-fuseki-ui jar was produced.


The build I did for the release was on Linux and the e2e:test passed in 
the release build and all the subsequent checking.


So it runs the tests, and they pass, sometimes.
I think we can continue and address the issue as part of regular 
development if that's OK.


     Andy



Arne

On Sat, 16 March 2024 at 19:34, Andy Seaborne <a...@apache.org> wrote:


    Hi,

    Here is a vote on the release of Apache Jena version 5.0.0.

     Release Vote

    This vote will be open until at least

      Wednesday 20th March 2024 at 08:00 UTC

    Please vote to approve this release:

          [ ] +1 Approve the release
          [ ]  0 Don't care
          [ ] -1 Don't release, because ...

    Everyone, not just committers, is invited to test and vote.
    Please download and test the proposed release. See the checklist below.


    Staging repository:

https://repository.apache.org/content/repositories/orgapachejena-1063 


    Proposed dist/ area:
    https://dist.apache.org/repos/dist/dev/jena/

    Keys:
    https://svn.apache.org/repos/asf/jena/dist/KEYS

    Git commit (browser URL):
    https://github.com/apache/jena/commit/f475cdc84a

    Git Commit Hash:
    f475cdc84a85e48c22a2c6487141e2d782c10517

    Git Commit Tag:
    jena-5.0.0

    If you expect to check the release but the time limit does not work
    for you, please email within the schedule above.

      Andy


     About Jena5 

    == General

    Issues since Jena 4.10.0:

    https://s.apache.org/jena-5.0.0-issues

    which includes the ones specifically related to Jena5:

    https://github.com/apache/jena/issues?q=label%3Ajena5


    ** Java Requirement

    Java 17 or later is required.
    Java 17 language constructs now are used in the codebase.

    ** Language tags

    Language tags are now case-insensitively unique.

    "abc"@EN and "abc"@en are the same RDF term.

    Internally, language tags are formatted using the algorithm of RFC 5646.


    Examples "@en", "@en-GB", "@en-Latn-GB".

    SPARQL LANG(?literal) will return a formatted language tag.

    Data stored in TDB using language tags must be reloaded.

    ** Term graphs

    Graphs are now term graphs in the API or SPARQL. That is, they do not
    match "same value" for some of the java mapped datatypes. The model API
    already normalizes values written.

    TDB1, TDB2 keep their value canonicalization during data loading.

    A legacy value-graph implementation can be obtained from
    GraphMemFactory.

    ** RRX - New RDF/XML parser

    RRX is the default RDF/XML parser. It is a replacement for ARP.
    RIOT uses RRX.

    The ARP parser is still temporarily available 

Re: [] Apache Jena 5.0.0

2024-03-16 Thread Andy Seaborne




On 16/03/2024 19:50, Arne Bernhardt wrote:

Hi,

it may be nothing but on my system there are a few "ERROR "s in the 
console that I can't categorise (see attached log). The general result 
is a successful build.

For example on line 4250:
"[ERROR] There are test failures.
Failed to run task: 'yarn run test:e2e' failed.
com.github.eirslett.maven.plugins.frontend.lib.TaskRunnerException: 
'yarn run test:e2e' failed. ..."


Hi Arne - thanks for checking the release.

This is from the jena-fuseki-ui.

It looks like a failure to run the test framework, not a test failure.
The e2e test framework is sensitive to the environment.

"Process exited with an error: 1 (Exit value: 1)" isn't the most 
informative of error messages :-)


It may be because (despite the yarn,node download) something in the 
toolchain is an old version.


I have:
  node --version => v18.19.1
  yarn --version => 1.22.19
  npm --version => 10.2.4

I checked the bots. The github action for MS Windows runs it fine; the 
Jenkins Windows job has the report you have.


This maven module produces jena-fuseki-ui-5.0.0.jar and this is unpacked 
in jena-fuseki-webapp:pom.xml by maven-dependency-plugin. (a way to pass 
the built vue app through the build artifacts).


The jena-fuseki-webapp build step succeeded so it looks like 
jena-fuseki-ui jar was produced.


The build I did for the release was on Linux and the e2e:test passed in 
the release build and all the subsequent checking.


So it runs the tests, and they pass, sometimes.
I think we can continue and address the issue as part of regular 
development if that's OK.


Andy



Arne

On Sat, 16 March 2024 at 19:34, Andy Seaborne <a...@apache.org> wrote:


Hi,

Here is a vote on the release of Apache Jena version 5.0.0.

 Release Vote

This vote will be open until at least

      Wednesday 20th March 2024 at 08:00 UTC

Please vote to approve this release:

          [ ] +1 Approve the release
          [ ]  0 Don't care
          [ ] -1 Don't release, because ...

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release. See the checklist below.

Staging repository:
https://repository.apache.org/content/repositories/orgapachejena-1063 

Proposed dist/ area:
https://dist.apache.org/repos/dist/dev/jena/

Keys:
https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
https://github.com/apache/jena/commit/f475cdc84a

Git Commit Hash:
    f475cdc84a85e48c22a2c6487141e2d782c10517

Git Commit Tag:
    jena-5.0.0

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

      Andy


 About Jena5 

== General

Issues since Jena 4.10.0:

https://s.apache.org/jena-5.0.0-issues

which includes the ones specifically related to Jena5:

https://github.com/apache/jena/issues?q=label%3Ajena5


** Java Requirement

Java 17 or later is required.
Java 17 language constructs now are used in the codebase.

** Language tags

Language tags are now case-insensitively unique.

"abc"@EN and "abc"@en are the same RDF term.

Internally, language tags are formatted using the algorithm of RFC 5646.

Examples "@en", "@en-GB", "@en-Latn-GB".

SPARQL LANG(?literal) will return a formatted language tag.

Data stored in TDB using language tags must be reloaded.

** Term graphs

Graphs are now term graphs in the API or SPARQL. That is, they do not
match "same value" for some of the java mapped datatypes. The model API
already normalizes values written.

TDB1, TDB2 keep their value canonicalization during data loading.

A legacy value-graph implementation can be obtained from
GraphMemFactory.

** RRX - New RDF/XML parser

RRX is the default RDF/XML parser. It is a replacement for ARP.
RIOT uses RRX.

The ARP parser is still temporarily available for transition assistance.

** Remove support for JSON-LD 1.0

JSON-LD 1.1, using Titanium-JSON-LD, is the supported version of
JSON-LD.

https://github.com/filip26/titanium-json-ld

** Turtle/Trig Output

"PREFIX" and "BASE" are output by default for Turtle and TriG output.

** Misc

There is now a rel

Re: [VOTE] Apache Jena 5.0.0

2024-03-16 Thread Andy Seaborne

[x] +1 Approve the release

Andy

On 16/03/2024 18:32, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena version 5.0.0.

 Release Vote

This vote will be open until at least

     Wednesday 20th March 2024 at 08:00 UTC

Please vote to approve this release:

     [ ] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release. See the checklist below.

Staging repository:
   https://repository.apache.org/content/repositories/orgapachejena-1063

Proposed dist/ area:
   https://dist.apache.org/repos/dist/dev/jena/

Keys:
   https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
   https://github.com/apache/jena/commit/f475cdc84a

Git Commit Hash:
   f475cdc84a85e48c22a2c6487141e2d782c10517

Git Commit Tag:
   jena-5.0.0

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

     Andy


 About Jena5 

== General

Issues since Jena 4.10.0:

   https://s.apache.org/jena-5.0.0-issues

which includes the ones specifically related to Jena5:

   https://github.com/apache/jena/issues?q=label%3Ajena5


** Java Requirement

Java 17 or later is required.
Java 17 language constructs now are used in the codebase.

** Language tags

Language tags are now case-insensitively unique.

"abc"@EN and "abc"@en are the same RDF term.

Internally, language tags are formatted using the algorithm of RFC 5646.

Examples "@en", "@en-GB", "@en-Latn-GB".

SPARQL LANG(?literal) will return a formatted language tag.

Data stored in TDB using language tags must be reloaded.
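
To make the formatting rule concrete, here is a minimal sketch of the case conventions in RFC 5646 section 2.1.1 as I read them: everything lowercase, except that two-letter subtags become uppercase and four-letter subtags become title case when they are not the first subtag and do not follow a singleton. The class and method names are hypothetical and this is not Jena's code; it does no validation of the tag.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of the Jena API.
final class LangTagFormatSketch {

    /** Apply RFC 5646 section 2.1.1 case conventions to a language tag. */
    static String format(String tag) {
        String[] parts = tag.split("-");
        StringBuilder sb = new StringBuilder();
        boolean afterSingleton = false;
        for (int i = 0; i < parts.length; i++) {
            String p = parts[i].toLowerCase(Locale.ROOT);
            if (i > 0 && !afterSingleton) {
                if (p.length() == 2) {
                    // Region-style subtag, e.g. "gb" -> "GB"
                    p = p.toUpperCase(Locale.ROOT);
                } else if (p.length() == 4) {
                    // Script-style subtag, e.g. "latn" -> "Latn"
                    p = Character.toUpperCase(p.charAt(0)) + p.substring(1);
                }
            }
            if (i > 0) {
                sb.append('-');
            }
            sb.append(p);
            if (p.length() == 1) {
                // Subtags after a singleton (e.g. "x") stay lowercase.
                afterSingleton = true;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(format("EN"));
        System.out.println(format("en-gb"));
        System.out.println(format("EN-LATN-gb"));
    }
}
```

With this rule "abc"@EN and "abc"@en both carry the tag "en", which is why they are the same RDF term after formatting.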

** Term graphs

Graphs are now term graphs in the API or SPARQL. That is, they do not 
match "same value" for some of the java mapped datatypes. The model API 
already normalizes values written.


TDB1, TDB2 keep their value canonicalization during data loading.

A legacy value-graph implementation can be obtained from GraphMemFactory.
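
The difference between a term graph and a value graph can be sketched without any Jena classes. The code below is an illustration under simple assumptions (only xsd:integer is value-compared, and the type and method names are hypothetical): term matching compares lexical form and datatype exactly, while value matching compares what the literal denotes.

```java
import java.math.BigInteger;

// Hypothetical types, for illustration only; not part of the Jena API.
final class TermVsValueSketch {

    static final String XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer";

    /** A literal as an RDF term: lexical form plus datatype IRI. */
    record Literal(String lexical, String datatype) {}

    /** Term matching: same lexical form and same datatype. */
    static boolean sameTerm(Literal a, Literal b) {
        return a.equals(b);
    }

    /** Value matching, here only for xsd:integer; otherwise falls back to terms. */
    static boolean sameValue(Literal a, Literal b) {
        if (XSD_INTEGER.equals(a.datatype()) && XSD_INTEGER.equals(b.datatype())) {
            return new BigInteger(a.lexical()).equals(new BigInteger(b.lexical()));
        }
        return sameTerm(a, b);
    }

    public static void main(String[] args) {
        Literal one = new Literal("1", XSD_INTEGER);
        Literal zeroOne = new Literal("01", XSD_INTEGER);
        System.out.println("sameTerm:  " + sameTerm(one, zeroOne));
        System.out.println("sameValue: " + sameValue(one, zeroOne));
    }
}
```

A term graph behaves like sameTerm when matching triples; the legacy value-graph behaviour corresponds to sameValue for the mapped datatypes.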

** RRX - New RDF/XML parser

RRX is the default RDF/XML parser. It is a replacement for ARP.
RIOT uses RRX.

The ARP parser is still temporarily available for transition assistance.

** Remove support for JSON-LD 1.0

JSON-LD 1.1, using Titanium-JSON-LD, is the supported version of JSON-LD.

https://github.com/filip26/titanium-json-ld

** Turtle/Trig Output

"PREFIX" and "BASE" are output by default for Turtle and TriG output.

** Misc

There is now a release BOM for Jena artifacts - artifact 
org.apache.jena:jena-bom


There are now OWASP CycloneDX SBOM for Jena artifacts.
https://github.com/CycloneDX


 API Users

** Deprecation removal

There has been a clearing out of deprecated functions, methods and 
classes. This includes the deprecations in Jena 4.10.0 added to show 
code that is being removed in Jena5.


** QueryExecutionFactory

QueryExecutionFactory is simplified to cover common cases only; it 
becomes a way to call the general QueryExecution builders, which are 
preferred and provide full query execution setup controls.


Local execution builder:
QueryExecution.create()...

Remote execution builder:
QueryExecution.service(URL)...

** QueryExecution variable substitution

Using "substitution", where the query is modified by replacing one or 
more variables by RDF terms, is now preferred to using "initial 
bindings", where query solutions include (var,value) pairs.


"substitution" is available for all queries, local and remote, not just 
local executions.


Rename TDB1 packages org.apache.jena.tdb -> org.apache.jena.tdb1

The update to slf4j 2.x means any use of log4j should use artifact 
"log4j-slf4j2-impl" (was "log4j-slf4j-impl").



 Fuseki Users

Fuseki: Uses the jakarta namespace for servlets and Fuseki has been 
upgraded to use Eclipse Jetty12.


Apache Tomcat10 or later, is required for running the WAR file.
Tomcat 9 or earlier will not work.


---


Checking:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?
+ can the source archive be built?
   (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
   (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
    if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?


[VOTE] Apache Jena 5.0.0

2024-03-16 Thread Andy Seaborne

Hi,

Here is a vote on the release of Apache Jena version 5.0.0.

 Release Vote

This vote will be open until at least

Wednesday 20th March 2024 at 08:00 UTC

Please vote to approve this release:

[ ] +1 Approve the release
[ ]  0 Don't care
[ ] -1 Don't release, because ...

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release. See the checklist below.

Staging repository:
  https://repository.apache.org/content/repositories/orgapachejena-1063

Proposed dist/ area:
  https://dist.apache.org/repos/dist/dev/jena/

Keys:
  https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
  https://github.com/apache/jena/commit/f475cdc84a

Git Commit Hash:
  f475cdc84a85e48c22a2c6487141e2d782c10517

Git Commit Tag:
  jena-5.0.0

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

Andy


 About Jena5 

== General

Issues since Jena 4.10.0:

  https://s.apache.org/jena-5.0.0-issues

which includes the ones specifically related to Jena5:

  https://github.com/apache/jena/issues?q=label%3Ajena5


** Java Requirement

Java 17 or later is required.
Java 17 language constructs now are used in the codebase.

** Language tags

Language tags are now case-insensitive.

"abc"@EN and "abc"@en are the same RDF term.

Internally, language tags are formatted using the algorithm of RFC 5646.

Examples "@en", "@en-GB", "@en-Latn-GB".

SPARQL LANG(?literal) will return a formatted language tag.

Data stored in TDB using language tags must be reloaded.
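
As an illustration of the formatting rule only (this is not Jena's code, and the class and method names are made up for the example), the case conventions of RFC 5646 section 2.1.1 can be sketched as: everything lowercase, except that two-letter subtags become uppercase and four-letter subtags become title case when they are neither first in the tag nor after a singleton such as "x".

```java
import java.util.Locale;

/** Sketch of the RFC 5646 section 2.1.1 case conventions; hypothetical helper, not Jena API. */
final class LangTagFormat {

    static String format(String tag) {
        String[] parts = tag.split("-");
        StringBuilder sb = new StringBuilder();
        boolean afterSingleton = false;
        for (int i = 0; i < parts.length; i++) {
            // Default: every subtag is lowercase.
            String p = parts[i].toLowerCase(Locale.ROOT);
            if (i > 0 && !afterSingleton) {
                if (p.length() == 2) {
                    // Two-letter subtag, not first: uppercase (region, e.g. GB).
                    p = p.toUpperCase(Locale.ROOT);
                } else if (p.length() == 4) {
                    // Four-letter subtag, not first: title case (script, e.g. Latn).
                    p = Character.toUpperCase(p.charAt(0)) + p.substring(1);
                }
            }
            // A one-character subtag is a singleton; everything after it stays lowercase.
            if (p.length() == 1) {
                afterSingleton = true;
            }
            if (i > 0) {
                sb.append('-');
            }
            sb.append(p);
        }
        return sb.toString();
    }
}
```

With this sketch, "EN", "en-gb" and "EN-LATN-GB" format to the three examples given above. It does no validation of the tag.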

** Term graphs

Graphs are now term graphs in the API or SPARQL. That is, they do not 
match "same value" for some of the java mapped datatypes. The model API 
already normalizes values written.


TDB1, TDB2 keep their value canonicalization during data loading.

A legacy value-graph implementation can be obtained from GraphMemFactory.

** RRX - New RDF/XML parser

RRX is the default RDF/XML parser. It is a replacement for ARP.
RIOT uses RRX.

The ARP parser is still temporarily available for transition assistance.

** Remove support for JSON-LD 1.0

JSON-LD 1.1, using Titanium-JSON-LD, is the supported version of JSON-LD.

https://github.com/filip26/titanium-json-ld

** Turtle/Trig Output

"PREFIX" and "BASE" are output by default for Turtle and TriG output.

** Misc

There is now a release BOM for Jena artifacts - artifact 
org.apache.jena:jena-bom


There are now OWASP CycloneDX SBOMs for Jena artifacts.
https://github.com/CycloneDX


 API Users

** Deprecation removal

There has been a clearing out of deprecated functions, methods and 
classes. This includes the deprecations in Jena 4.10.0 added to show 
code that is being removed in Jena5.


** QueryExecutionFactory

QueryExecutionFactory is simplified to cover common cases only; it 
becomes a way to call the general QueryExecution builders, which are 
preferred and provide the full set of query execution setup controls.


Local execution builder:
QueryExecution.create()...

Remote execution builder:
QueryExecution.service(URL)...

** QueryExecution variable substitution

Using "substitution", where the query is modified by replacing one or 
more variables by RDF terms, is now preferred to using "initial 
bindings", where query solutions include (var,value) pairs.


"substitution" is available for all queries, local and remote, not just 
local executions.


Rename TDB1 packages org.apache.jena.tdb -> org.apache.jena.tdb1

The update to slf4j 2.x means any use of log4j should use artifact 
"log4j-slf4j2-impl" (was "log4j-slf4j-impl").



 Fuseki Users

Fuseki: Uses the jakarta namespace for servlets and Fuseki has been 
upgraded to use Eclipse Jetty12.


Apache Tomcat 10 or later is required for running the WAR file.
Tomcat 9 or earlier will not work.


---


Checking:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?
+ can the source archive be built?
  (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
  (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
   if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?
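
For the checksum item, a minimal sketch of one way to check an artifact from Java. It assumes the checksum file starts with the hex SHA-512 digest, optionally followed by a filename; the class and method names are made up for the example, and the usual command line tools do the same job.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

/** Hypothetical helper for checking a downloaded artifact against a checksum file. */
final class ChecksumCheck {

    /** Compute the SHA-512 digest of a file as lowercase hex. */
    static String sha512Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return HexFormat.of().formatHex(md.digest());
    }

    /** Assumption: the first whitespace-separated token of the checksum file is the hex digest. */
    static boolean matches(Path artifact, Path checksumFile)
            throws IOException, NoSuchAlgorithmException {
        String expected = Files.readString(checksumFile).trim().split("\\s+")[0];
        return expected.equalsIgnoreCase(sha512Hex(artifact));
    }
}
```

This only covers the checksum; the GPG signature check still needs the KEYS file and a signature verification tool.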


Re: Towards Jena 5.0.0

2024-03-16 Thread Andy Seaborne

Now doing a build!

Recent work includes:

* Arne provided faster memory term graph copying
* Bruno improved the display of graph names when there are
  a lot of graphs in a dataset
* Rob converted the GeoSPARQL caching to use Caffeine
* Apache Commons Compress upgrade to 1.26 (addresses CVE-2024-25710)
* TDB2 compaction is now robust on all operating systems.
* MS Windows tests pass regularly (Jenkins and Github actions)
* Added JUnit5 dependencies

https://github.com/apache/jena/issues?q=is%3Aissue+closed%3A2024-02-10..2024-07-01+-label%3Aquestion

On 15/03/2024 16:06, Bruno Kinoshita wrote:

+1, I think 5.0.0 can go out and we can continue working on things for
5.0.1, 5.0.2, ..., 5.1, etc. :)

Thanks Andy!


Re: Towards Jena 5.0.0

2024-03-14 Thread Andy Seaborne

Status:

There's code for safe compaction on MS Windows now.

https://github.com/apache/jena/pull/2321

There are non-deterministic test failures in the build on MS Windows only, and 
they are more likely when the build server is busy.


A few more details on
https://github.com/apache/jena/issues/2328
https://github.com/apache/jena/pull/2329

I'd like to proceed with Jena 5.0.0 with the partial improvement applied 
and then clean the rest of the cases, allowing for any proper rewrites 
to improve an area of code, rather than only fixing the presenting problem.


Andy


Re: Towards Jena 5.0.0

2024-03-07 Thread Andy Seaborne



On 07/03/2024 10:44, Andy Seaborne wrote:
...

== Changes of note



TDB2 Compaction make robust against exceptions and server restart


is causing a problem on Windows. Moving a directory with memory mapped 
files does not work on MS Windows.


https://github.com/apache/jena/issues/2315

The good news is that this does show up on both the Jenkins/Windows build and 
in the Windows GitHub Actions workflow.


An immediate fix is to use the non-robust code on Windows pro tem.
Using a transient file to record the highest numbered complete storage 
database is one approach to a more complete solution longer term.
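
To make the longer-term idea concrete, here is a rough sketch, not the TDB2 implementation. The marker file name, class and method names are all made up for illustration. The point is that completion is recorded in a small file, so no directory containing memory mapped files has to be moved.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch: record the highest numbered complete storage directory in a file. */
final class CompactionMarker {
    // Assumed file name; not a name used by TDB2.
    private static final String MARKER = "compaction-complete";

    /** Called only after the new storage generation has been fully written. */
    static void recordComplete(Path container, int generation) throws IOException {
        // Write to a temporary file, then move it into place, so that readers
        // do not see a partially written marker.
        Path tmp = container.resolve(MARKER + ".tmp");
        Files.writeString(tmp, Integer.toString(generation), StandardCharsets.UTF_8);
        Files.move(tmp, container.resolve(MARKER), StandardCopyOption.REPLACE_EXISTING);
    }

    /** On startup: which generation is safe to use? Falls back if there is no marker. */
    static int latestComplete(Path container, int fallback) throws IOException {
        Path marker = container.resolve(MARKER);
        if (!Files.exists(marker)) {
            return fallback;
        }
        return Integer.parseInt(Files.readString(marker, StandardCharsets.UTF_8).trim());
    }
}
```

A storage directory numbered higher than the recorded value would then be treated as an incomplete compaction and ignored or cleaned up on restart.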


Andy

Original report:
https://github.com/apache/jena/issues/2254


Re: Towards Jena 5.0.0

2024-03-07 Thread Andy Seaborne

It's looking good for Apache Jena 5.0.0 one month(ish) after 5.0.0-rc1.

== Changes since 5.0.0-rc1

Closed issues which are not questions:

https://github.com/apache/jena/issues?q=is%3Aissue+closed%3A2024-02-10..2024-07-01+-label%3Aquestion

== Changes of note

Configurable CORS headers for Fuseki
  from @TelicentPaul

Explicit Accept headers on RDFConnectionRemote fix
  from @Aklakan

TDB2 Compaction make robust against exceptions and server restart

Implement xsd:duration divide operations

Better LATERAL implementation

== Dependency updates of note

Lucene upgrade from 9.9 to 9.10

JSON-LD upgrade
  @filip26 released titanium-json-ld 1.4.0


Re: New jena Jira account requested:

2024-03-01 Thread Andy Seaborne




On 29/02/2024 19:23, Andy Seaborne wrote:



On 28/02/2024 13:07, Andy Seaborne wrote:

If your project no longer uses Jira for issue tracking, you can have your
project removed from the dropdown list, preventing more requests like this
one. Create an INFRA jira, email users@infra.a.o or contact us in the
#asfinfra channel in the-asf Slack instance.


Requested:

https://issues.apache.org/jira/browse/INFRA-25568


Infra have done this.

Sometime, we should make Jena JIRA read-only after checking the website 
has no more mention of JIRA.


Andy



Re: New jena Jira account requested:

2024-02-29 Thread Andy Seaborne




On 28/02/2024 13:07, Andy Seaborne wrote:

If your project no longer uses Jira for issue tracking, you can have your
project removed from the dropdown list, preventing more requests like this
one. Create an INFRA jira, email users@infra.a.o or contact us in the
#asfinfra channel in the-asf Slack instance.


Requested:

https://issues.apache.org/jira/browse/INFRA-25568


Re: New jena Jira account requested:

2024-02-28 Thread Andy Seaborne



On 27/02/2024 09:49, ASF Self-serve Portal wrote:



...



Note: If your project no longer uses Jira for issue tracking, you can have your
project removed from the dropdown list, preventing more requests like this
one. Create an INFRA jira, email users@infra.a.o or contact us in the
#asfinfra channel in the-asf Slack instance.



Shall we get Jena removed from the JIRA dropdown for requesting access?

We aren't getting genuine JIRA signup requests - the old issues are anon 
readable and existing people can still do the JIRA thing if they really 
want to.


Andy


[RESULT] [VOTE] Apache Jena 5.0.0-rc1 (first call)

2024-02-14 Thread Andy Seaborne

The VOTE passes with +1 from Rob, Bruno and Andy

I'll move on to getting the release out.

The ANN message will stress the opportunity for review and feedback 
before release of 5.0.0 in a month or so.


Thanks everyone.

Andy

On 10/02/2024 20:34, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena version "5.0.0-rc1".

The release candidate is made for wider review and feedback. It will 
hopefully be for a period of a month after which Jena 5.0.0 will be 
released.


Normal Jena development for fixes and improvements that do not cause 
change of functionality will continue as usual.


 Release Vote

This vote will be open until at least

     Wednesday 14th February 2024 at 08:00 UTC

Please vote to approve this release:

     [ ] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release. See the checklist below.

Staging repository:
   https://repository.apache.org/content/repositories/orgapachejena-1062

Proposed dist/ area:
   https://dist.apache.org/repos/dist/dev/jena/

Keys:
   https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
   https://github.com/apache/jena/commit/c44b77d3ff

Git Commit Hash:
   c44b77d3ffc04c25ee369c3af928fd8fe1394453

Git Commit Tag:
   jena-5.0.0-rc1

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.


Re: [VOTE] Apache Jena 5.0.0-rc1 (first call)

2024-02-12 Thread Andy Seaborne




On 12/02/2024 14:18, Rob @ DNR wrote:

+1 (binding)

Built and verified on OS X

Review notes:


   *   Some of the NOTICE files (specifically the one that ends up in 
jena-fuseki2/jena-fuseki-server/src/main/resources/META-INF/NOTICE) still 
reference jsonld-java which we no longer use/bundle.  The only dependency on it 
I can find is via the shaded Jena 4.8 module in the benchmarking module


Thanks - yes, it can be removed from jena-fuseki-server. I'll put a PR in.

Andy



Rob

From: Andy Seaborne 
Date: Saturday, 10 February 2024 at 20:48
To: dev@jena.apache.org 
Subject: Re: [VOTE] Apache Jena 5.0.0-rc1 (first call)


On 10/02/2024 20:34, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena version "5.0.0-rc1".

The release candidate is made for wider review and feedback. It will
hopefully be for a period of a month after which Jena 5.0.0 will be
released.

Normal Jena development for fixes and improvements that do not cause
change of functionality will continue as usual.

 Release Vote

This vote will be open until at least

  Wednesday 14th February 2024 at 08:00 UTC

Please vote to approve this release:

  [x] +1 Approve the release
  [ ]  0 Don't care
  [ ] -1 Don't release, because ...


+1 (binding)

  Andy



Re: [VOTE] Apache Jena 5.0.0-rc1 (first call)

2024-02-10 Thread Andy Seaborne




On 10/02/2024 20:34, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena version "5.0.0-rc1".

The release candidate is made for wider review and feedback. It will 
hopefully be for a period of a month after which Jena 5.0.0 will be 
released.


Normal Jena development for fixes and improvements that do not cause 
change of functionality will continue as usual.


 Release Vote

This vote will be open until at least

     Wednesday 14th February 2024 at 08:00 UTC

Please vote to approve this release:

     [x] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...


+1 (binding)

Andy


[VOTE] Apache Jena 5.0.0-rc1 (first call)

2024-02-10 Thread Andy Seaborne

Hi,

Here is a vote on the release of Apache Jena version "5.0.0-rc1".

The release candidate is made for wider review and feedback. It will 
hopefully be for a period of a month after which Jena 5.0.0 will be 
released.


Normal Jena development for fixes and improvements that do not cause 
change of functionality will continue as usual.


 Release Vote

This vote will be open until at least

Wednesday 14th February 2024 at 08:00 UTC

Please vote to approve this release:

[ ] +1 Approve the release
[ ]  0 Don't care
[ ] -1 Don't release, because ...

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release. See the checklist below.

Staging repository:
  https://repository.apache.org/content/repositories/orgapachejena-1062

Proposed dist/ area:
  https://dist.apache.org/repos/dist/dev/jena/

Keys:
  https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
  https://github.com/apache/jena/commit/c44b77d3ff

Git Commit Hash:
  c44b77d3ffc04c25ee369c3af928fd8fe1394453

Git Commit Tag:
  jena-5.0.0-rc1

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

Andy


 About Jena5 


== General

Issues since Jena 4.10.0:

  https://s.apache.org/jena-5.0.0-issues

which includes the ones specifically related to Jena5:

  https://github.com/apache/jena/issues?q=label%3Ajena5


** Java Requirement

Java 17 or later is required.
Java 17 language constructs now are used in the codebase.

** Language tags

Language tags are now case-insensitive.

"abc"@EN and "abc"@en are the same RDF term.

Internally, language tags are formatted using the algorithm of RFC 5646.

Examples "@en", "@en-GB", "@en-Latn-GB".

SPARQL LANG(?literal) will return a formatted language tag.

Data stored in TDB using language tags must be reloaded.

** Term graphs

Graphs are now term graphs in the API or SPARQL. That is, they do not 
match "same value" for some of the java mapped datatypes. The model API 
already normalizes values written.


TDB1, TDB2 keep their value canonicalization during data loading.

A legacy value-graph implementation can be obtained from GraphMemFactory.

** RRX - New RDF/XML parser

RRX is the default RDF/XML parser. It is a replacement for ARP.
RIOT uses RRX.

The ARP parser is still temporarily available for transition assistance.

** Remove support for JSON-LD 1.0

JSON-LD 1.1, using Titanium-JSON-LD, is the supported version of JSON-LD.

https://github.com/filip26/titanium-json-ld

** Turtle/Trig Output

"PREFIX" and "BASE" are output by default for Turtle and TriG output.

** Misc

There is now a release BOM for Jena artifacts - artifact 
org.apache.jena:jena-bom


There are now OWASP CycloneDX SBOMs for Jena artifacts.
https://github.com/CycloneDX


 API Users

** Deprecation removal

There has been a clearing out of deprecated functions, methods and 
classes. This includes the deprecations in Jena 4.10.0 added to show 
code that is being removed in Jena5.


** QueryExecutionFactory

QueryExecutionFactory is simplified to cover common cases only; it 
becomes a way to call the general QueryExecution builders, which are 
preferred and provide the full set of query execution setup controls.


Local execution builder:
QueryExecution.create()...

Remote execution builder:
QueryExecution.service(URL)...

** QueryExecution variable substitution

Using "substitution", where the query is modified by replacing one or 
more variables by RDF terms, is now preferred to using "initial 
bindings", where query solutions include (var,value) pairs.


"substitution" is available for all queries, local and remote, not just 
local executions.


Rename TDB1 packages org.apache.jena.tdb -> org.apache.jena.tdb1

The update to slf4j 2.x means any use of log4j should use artifact 
"log4j-slf4j2-impl" (was "log4j-slf4j-impl").



 Fuseki Users

Fuseki: Uses the jakarta namespace for servlets and Fuseki has been 
upgraded to use Eclipse Jetty12.


Apache Tomcat 10 or later is required for running the WAR file.
Tomcat 9 or earlier will not work.


---


Checking:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?
+ can the source archive be built?
  (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
  (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
   if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?


Re: Towards Jena 5.0.0

2024-02-10 Thread Andy Seaborne




On 02/02/2024 17:04, Andy Seaborne wrote:



On 01/02/2024 13:20, Andy Seaborne wrote:

It's about time for Jena 5.0.0.


I'm going to do the build as 5.0.0-rc1.

First attempt failed - some post-maven build checking showed that the 
command line tools and binary packaging need to include the slf4j v2 
artifacts for log4j.


Andy



Re: Towards Jena 5.0.0

2024-02-02 Thread Andy Seaborne




On 01/02/2024 13:20, Andy Seaborne wrote:

It's about time for Jena 5.0.0.



== Current state

1/
There is a test failure on Windows around determining a base URI 
involving files. This needs investigating and correcting.


PR available.



2/
We have a problem with the UI part of the build on Jenkins.


Good news! INFRA did a bunch of updates late last year and now general 
build servers can run node20 used in the Fuseki UI build.


Andy



Towards Jena 5.0.0

2024-02-01 Thread Andy Seaborne

It's about time for Jena 5.0.0.

The most significant application and user visible changes include:

- require java17
- code cleanup of deprecated methods and classes
- Remove JSON-LD 1.0 support
- Default Turtle output to use PREFIX
- Replace ARP with RRX (RDF/XML parsing)
- Rename artifact jena-tdb as jena-tdb1.

A question is whether to have 5.0.0-RC1 or 5.0.0.

If it's 5.0.0-RC1, then my suggestion is to try for a one month RC then 
release Jena 5.0.0.


I prefer having an RC cycle.

From my POV the current code in main is at the same readiness as any other 
release. An RC is for feedback on the major version level changes.


Feedback doesn't always arrive. Waiting a full 3 month cycle is too long.

Do PMC members have the bandwidth to VOTE on 2 releases in this shortened 
time?


Andy

Issues since Jena 4.10.0:

  https://s.apache.org/jena-5.0.0-issues

which includes the ones specifically related to Jena5:

  https://github.com/apache/jena/issues?q=label%3Ajena5

== Current state

1/
There is a test failure on Windows around determining a base URI 
involving files. This needs investigating and correcting.


2/
We have a problem with the UI part of the build on Jenkins.

The build servers are Ubuntu 18.04 which has an old version of glibc 
and this causes node 18+ to fail. Only node16 is available.


We have 3 dependabot upgrades pending because of this.

There is work on a jenkins pipeline
https://ci-builds.apache.org/job/Jena/job/Jena-pipeline/

[INFO] EACCES: permission denied, mkdir '/.cache'

(it's running in a container - HOME is '/' and the default Cypress 
cache is "~/.cache").


The build does work using the Github actions aside from (1)

((
PS - Update - managed to get a build of main to pass!
))

3/
Outstanding - LATERAL

The implementation of LATERAL is "weak", to put it politely, as it makes 
assumptions about how query execution works and fails in a recently 
identified case. I have a reworked implementation and I'm currently 
writing tests for it.


It would be good to include it but it's not essential.

4/
Outstanding - tdb commands

Rework the TDB command line tools to favour TDB2 when creating a new 
database, and to work on either TDB1 or TDB2 by inspecting the database 
directory.


Output of bad URIs

2024-01-14 Thread Andy Seaborne

This is to highlight issue 2167.

https://github.com/apache/jena/issues/2167

What to do if asked to print a URI string that has bad characters in it 
when outputting Turtle-family syntax?


[18]  IRIREF  ::=  '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'

https://www.w3.org/TR/turtle/#grammar-production-IRIREF

Parsing also requires passing RFC 3986 in addition to the IRIREF rule.
There is no "fix the URI".

Percent encoding "encodes" - it changes the URI (the output URI string 
would not match the input).


The current PR - for discussion - puts in UCHAR (which is an escape 
mechanism). That at least then passes the IRIREF rule but it is not a 
legal URI; it has a bad character in it.
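
A sketch of what using UCHAR on output could look like, written from the grammar rule above rather than from the PR (the class and method names are made up): every character excluded by IRIREF is written as a 4-digit UCHAR escape, everything else is copied through.

```java
/** Hypothetical sketch: write a URI string so that it matches the Turtle IRIREF rule. */
final class IriRefEscape {
    // Characters excluded by IRIREF, in addition to the range #x00-#x20.
    private static final String EXCLUDED = "<>\"{}|^`\\";

    static String toIriRef(String iri) {
        StringBuilder sb = new StringBuilder("<");
        for (int i = 0; i < iri.length(); i++) {
            char ch = iri.charAt(i);
            if (ch <= 0x20 || EXCLUDED.indexOf(ch) >= 0) {
                // Emit the 4 hex digit UCHAR form for the excluded character.
                sb.append(String.format("\\u%04X", (int) ch));
            } else {
                sb.append(ch);
            }
        }
        return sb.append('>').toString();
    }
}
```

For example, a space in the input comes out as the escape for U+0020. The output parses by the grammar rule, and a parser will unescape it back to the same string, so the result is still not a legal URI by RFC 3986; the escape only preserves the input exactly, unlike percent encoding.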


Andy


Re: Jena5: what to expect

2023-12-28 Thread Andy Seaborne

In Jena4, jena-fuseki-fulljar is the WAR file code + Jetty.

Fuseki/main (jena-fuseki-server) is also already packaged with Jetty.

You may be thinking of changing jena-fuseki-fulljar (the standalone 
packaging of Fuseki+UI) to be constructed from Fuseki/main/Jetty + Admin 
code + UI.


That change is in theory transparent. It is unlikely to be in Jena 5.0.x

It may be better to take the opportunity to have variants like 
Fuseki+query (readonly, for publishing data), Fuseki+data workbench 
(query+update, but not create/delete databases) as well as the one with the 
current UI.


Andy

On 28/12/2023 11:18, Marco Neumann wrote:

Hi Andy,
I remember reading about a replacement of jetty as the default servlet
container for fuseki. Is that still the case going forward?

Marco

On Thu, Dec 28, 2023 at 10:41 AM Andy Seaborne  wrote:


Jena5 is the next planned release for Apache Jena.

** All issues for Jena5:

https://github.com/apache/jena/issues?q=is%3Aissue+label%3AJena5

** Java Requirement

Java 17 or later is required.
Artifacts are Java17 bytecode.
Java 17 language constructs now are used in the codebase.

** Term graphs

Graphs are now term graphs in the API or SPARQL. That is, they do not
match "same value" for some of the java mapped datatypes. The model API
already normalizes values written.

The default in-memory graphs become term graphs.

TDB1, TDB2 keep their value canonicalization during data loading.

A legacy value-graph implementation can be obtained from GraphMemFactory.

** Language tags

Language tags are now case-insensitive.

"abc"@EN and "abc"@en are the same RDF term.

Internally, language tags are formatted using the algorithm of RFC 5646.

Examples "@en", "@en-GB", "@en-Latn-GB".

SPARQL LANG(?literal) will return a formatted language tag.

Data stored in TDB using language tags must be reloaded.

** RRX - New RDF/XML parser

RRX is the default RDF/XML parser. It is a replacement for ARP.
RIOT uses RRX.

* daml:collection is not supported.
* Strict rdf:parseType
* Relative namespaces supported.

The ARP parser is still temporarily available for transition assistance.

** Remove support for JSON-LD 1.0

JSON-LD 1.1, using Titanium-JSON-LD, is the supported version of JSON-LD.

https://github.com/filip26/titanium-json-ld

** Turtle/Trig Output

"PREFIX" and "BASE" are output by default for Turtle and TriG output.


 API Users

** Deprecation removal

There has been a general clearing out of deprecated functions, methods
and classes. This includes deprecations in Jena 4.10.0 added to show
code that is being removed in Jena5.

** QueryExecutionFactory

QueryExecutionFactory is simplified to cover common cases only; it
becomes a way to call the general QueryExecution builders, which provide
full query execution setup.

Local execution builder:
QueryExecution.create()...

Remote execution builder:
QueryExecution.service(URL)...

** QueryExecution variable substitution

Using "substitution", where the query is modified by replacing one or
more variables by RDF terms, is now preferred to using "initial
bindings", where query solutions include (var,value) pairs.

"substitution" is available for all queries, local and remote, not just
local executions.


 Fuseki Users

Fuseki: Uses the jakarta namespace for servlets and Fuseki has been
upgraded to use Eclipse Jetty12.

Apache Tomcat 10 or later is required for running the WAR file.
Tomcat 9 or earlier will not work.






Jena5: what to expect

2023-12-28 Thread Andy Seaborne

Jena5 is the next planned release for Apache Jena.

** All issues for Jena5:

https://github.com/apache/jena/issues?q=is%3Aissue+label%3AJena5

** Java Requirement

Java 17 or later is required.
Artifacts are Java17 bytecode.
Java 17 language constructs now are used in the codebase.

** Term graphs

Graphs are now term graphs in the API or SPARQL. That is, they do not 
match "same value" for some of the java mapped datatypes. The model API 
already normalizes values written.


The default in-memory graphs become term graphs.

TDB1, TDB2 keep their value canonicalization during data loading.

A legacy value-graph implementation can be obtained from GraphMemFactory.

** Language tags

Language tags are now case-insensitive.

"abc"@EN and "abc"@en are the same RDF term.

Internally, language tags are formatted using the algorithm of RFC 5646.

Examples "@en", "@en-GB", "@en-Latn-GB".

SPARQL LANG(?literal) will return a formatted language tag.

Data stored in TDB using language tags must be reloaded.

** RRX - New RDF/XML parser

RRX is the default RDF/XML parser. It is a replacement for ARP.
RIOT uses RRX.

* daml:collection is not supported.
* Strict rdf:parseType
* Relative namespaces supported.

The ARP parser is still temporarily available for transition assistance.

** Remove support for JSON-LD 1.0

JSON-LD 1.1, using Titanium-JSON-LD, is the supported version of JSON-LD.

https://github.com/filip26/titanium-json-ld

** Turtle/Trig Output

"PREFIX" and "BASE" are output by default for Turtle and TriG output.


 API Users

** Deprecation removal

There has been a general clearing out of deprecated functions, methods 
and classes. This includes deprecations in Jena 4.10.0 added to show 
code that is being removed in Jena5.


** QueryExecutionFactory

QueryExecutionFactory is simplified to cover common cases only; it 
becomes a way to call the general QueryExecution builders, which provide 
full query execution setup.


Local execution builder:
QueryExecution.create()...

Remote execution builder:
QueryExecution.service(URL)...

** QueryExecution variable substitution

Using "substitution", where the query is modified by replacing one or 
more variables by RDF terms, is now preferred to using "initial 
bindings", where query solutions include (var,value) pairs.


"substitution" is available for all queries, local and remote, not just 
local executions.



 Fuseki Users

Fuseki: Uses the jakarta namespace for servlets and Fuseki has been 
upgraded to use Eclipse Jetty12.


Apache Tomcat 10 or later is required for running the WAR file.
Tomcat 9 or earlier will not work.


Re: UI improvements Was: process question.

2023-11-29 Thread Andy Seaborne

Github issues for feature-scale things.
Ideally with associated pull request.

General discussions, dev@

Andy

On 29/11/2023 09:18, Marco Neumann wrote:

Bruno,
how do you gather input/ideas for UI improvements?

Best,
Marco


Re: process question.

2023-11-29 Thread Andy Seaborne

Claude,

For merging to main, we are moving towards "rebase and merge" away from 
"Create a merge commit". Squashing and tidy up is probably better done 
on the PR before the integration into main.


This is for keeping the long term history for main cleaner.

It may make sense to use a merge commit when history should preserve new 
functionality coming in - e.g. something large and significant we'd want 
to record.


Choose which you feel is appropriate but "rebase and merge" does produce 
more noise in the log history.


Andy

On 23/11/2023 15:22, Bruno Kinoshita wrote:

I think it depends. Sometimes I approve things that look good to me, but
you might still want to request an extra review from Andy or Rob as they
know the code base a lot better.


And this seems to be working.

A review has two aspects

- are there any wider issues (e.g. it has a new dependency), and has
  "process" been followed?

- review the code.

If a PR is ready, and only changes a clearly restricted area of code 
with no changes outside some subtree or small maven module, and passes 
the build - it doesn't bring everything down! - then consider merging it.


In the jargon "RTC", with modest "CTR" if it hangs around too long.
Always RTC would be ideal but we do have to be practical given the 
people-resources available.


There are GitHub actions to run the build on a cloned repo, or run "mvn 
clean verify" locally.


Anyone - not just committers - can comment on PRs.

CTR = "Commit Then Review"
RTC = "Review Then Commit"




In the same manner, if you modify the UI and Rob, Andy, or anybody else
reviews it, I am always happy to be added as a second reviewer for
UI/JavaScript if needed.

For the documentation in jena-site, though, pull requests are held back
until we have the code they talk about released, so that the documentation
is not ahead of what users are able to use.

On Thu, 23 Nov 2023 at 13:16, Claude Warren  wrote:


I haven't been around for awhile so I have a process question.
How many reviews are required before code can be merged?

Claude

--
LinkedIn: http://www.linkedin.com/in/claudewarren





Re: Collection of paths?

2023-11-19 Thread Andy Seaborne

Paths as in SPARQL paths?
Paths aren't in the RDF data model.

If so, then try PathBlock which are behind the syntax element 
ElementPathBlock


Andy

On 19/11/2023 17:04, Claude Warren wrote:

RDF Collection provides a mechanism to create a list of Nodes.
Is there a similar construct to create a list of Paths?
I don't see one.

Claude



Re: dataset union query.

2023-11-17 Thread Andy Seaborne




On 17/11/2023 17:43, Claude Warren wrote:

OK.  PBKAC.  But I would like to know if there is a standard name for the
Union of the graphs in the dataset rather than the arq specific one.


No, there isn't.

There has been discussion (e.g. [1] and [2]) on common names.

What would make sense for jena is DEFAULT/UNION/ALL (= union and default).

Andy


[1]
https://github.com/w3c/sparql-dev/blob/main/SEP/SEP-0004/sep-0004.md
[2]
https://github.com/w3c/sparql-dev/issues/43#issuecomment-480726412



On Fri, Nov 17, 2023 at 6:29 PM Claude Warren  wrote:


is there a GRAPH name for the union of the models in a dataset?

I have tried: ASK FROM  { { 
(){+}  }}

now assuming that there is a  in one of the models of the dataset
it should return "true"

Am I missing something? If not, I think I have found a bug.

--
LinkedIn: http://www.linkedin.com/in/claudewarren






Re: Switching to Jena5 for development

2023-11-02 Thread Andy Seaborne




On 01/11/2023 17:58, Andy Seaborne wrote:
With the release of Jena 4.10.0, we can switch branch "main" to Jena5 
for development.


There'll be a branch "jena4", starting at the commit for the CHANGES 
update.


Then ...

   One last rebase of "main" into "jena5" and force push of "jena5"
   Merge (fast-forward) jena5 to main.
   Remove branch jena5.

     Andy


"main" is now Jena5/Java17 and the WAR file needs Tomcat10.

Branch "jena4" exists.
Branch "jena5" will go away when PRs targeting it move over to "main".

Java17 language features, not preview features, can go in.

There are some already. Probably the most useful is multiline strings 
for writing queries and data snippets for test cases :-)
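As an illustration of the multiline-string point, a minimal sketch of a test-case query written as a Java text block, next to the older concatenation style. The class name and the query itself are made up for the example.

```java
class TextBlockQuery {
    // A SPARQL query as a Java text block (a standard language feature since Java 15).
    // Indentation common to all lines, including the closing delimiter line, is stripped.
    static final String QUERY = """
        PREFIX ex: <http://example/>
        SELECT ?s ?o
        WHERE { ?s ex:p ?o }
        """;

    // The same query with explicit newlines and concatenation, for comparison.
    static final String QUERY_OLD =
        "PREFIX ex: <http://example/>\n" +
        "SELECT ?s ?o\n" +
        "WHERE { ?s ex:p ?o }\n";

    public static void main(String[] args) {
        System.out.print(QUERY);
        System.out.println("Same string: " + QUERY.equals(QUERY_OLD));
    }
}
```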


--

The build is a bit messy - there are warnings to be investigated when 
using Java21.


Jenkins is building and deploying SNAPSHOT artifacts.
Repo:
  https://repository.apache.org/content/groups/snapshots/

Github actions:
  Linux/Ubuntu
Runs OK
  macOS
There is a test timeout issue in GeoSPARQL
  Solution: switch to a Caffeine time expiring cache.
  MS Windows
2 new tests for choosing the base with filename
with a drive letter fail without saying why.
WIP.

Andy


Switching to Jena5 for development

2023-11-01 Thread Andy Seaborne
With the release of Jena 4.10.0, we can switch branch "main" to Jena5 
for development.


There'll be a branch "jena4", starting at the commit for the CHANGES update.

Then ...

  One last rebase of "main" into "jena5" and force push of "jena5"
  Merge (fast-forward) jena5 to main.
  Remove branch jena5.

Andy


[RESULT] [VOTE] Apache Jena 4.10.0 RC 1

2023-11-01 Thread Andy Seaborne
The VOTE passes with 3 PMC +1 Votes (Bruno, Rob, Andy) and one 
highly appreciated community vote from Marco.


I can get on with pushing out the artifacts. Other than the basic update 
of the download page, the rest of the website may have to follow in the 
next day or two.


Thanks
Andy

On 24/10/2023 13:52, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena 4.10.0.
This is the first release candidate.

The deadline is

     Friday, 27th October 2023 at 18:00 UTC

Please vote to approve this release:

     [ ] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...


Re: [VOTE] Apache Jena 4.10.0 RC 1

2023-10-30 Thread Andy Seaborne

Could we get another PMC vote please?

On 24/10/2023 13:52, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena 4.10.0.
This is the first release candidate.

The deadline is

     Friday, 27th October 2023 at 18:00 UTC


Re: [VOTE] Apache Jena 4.10.0 RC 1

2023-10-24 Thread Andy Seaborne

[x] +1 Approve the release

On 24/10/2023 13:52, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena 4.10.0.
This is the first release candidate.

The deadline is

     Friday, 27th October 2023 at 18:00 UTC

Please vote to approve this release:

     [ ] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...

 Items in this release

Contributions:

Shawn Smith
"Race condition with QueryEngineRegistry and
UpdateEngineRegistry init()"
   https://issues.apache.org/jira/browse/JENA-2356

Ali Ariff
"Labeling for Blank Nodes Across Writers"
   https://github.com/apache/jena/issues/1997

sszuev
"jena-core: add more javadocs about Graph-mem thread-safety and 
ConcurrentModificationException"

   https://github.com/apache/jena/pull/1994

sszuev
GH-1419: fix DatasetGraphMap#clear
   https://github.com/apache/jena/issue/1419

sszuev
GH-1374: add copyWithRegisties Context helper method
   https://github.com/apache/jena/issue/1374

---

Key upgrades

org.apache.lucene : 9.5.0 -> 9.7.0
org.apache.commons:commons-lang3: 3.12.0 -> 3.13.0
org.apache.sis.core:sis-referencing : 1.1 -> 1.4

 Jena 5

Jena 4.10.0 is the last planned release of Jena 4.x.x

There are deprecations to indicate functionality to be removed in Jena5.

Jena5 will require Java17.

 Release Vote

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release.

Staging repository:
   https://repository.apache.org/content/repositories/orgapachejena-1060

Proposed dist/ area:
   https://dist.apache.org/repos/dist/dev/jena/

Keys:
   https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
   https://github.com/apache/jena/commit/21500eeb1b

Git Commit Hash:
   21500eeb1b616b6bc370e6c900a3e027b37763c7

Git Commit Tag:
   jena-4.10.0

This vote will be open until at least

   Friday, 27th October 2023 at 18:00 UTC

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

     Thanks,
     Andy

Checking:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?
+ can the source archive be built?
   (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
   (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
    if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?


[VOTE] Apache Jena 4.10.0 RC 1

2023-10-24 Thread Andy Seaborne

Hi,

Here is a vote on the release of Apache Jena 4.10.0.
This is the first release candidate.

The deadline is

Friday, 27th October 2023 at 18:00 UTC

Please vote to approve this release:

[ ] +1 Approve the release
[ ]  0 Don't care
[ ] -1 Don't release, because ...

 Items in this release

Contributions:

Shawn Smith
"Race condition with QueryEngineRegistry and
UpdateEngineRegistry init()"
  https://issues.apache.org/jira/browse/JENA-2356

Ali Ariff
"Labeling for Blank Nodes Across Writers"
  https://github.com/apache/jena/issues/1997

sszuev
"jena-core: add more javadocs about Graph-mem thread-safety and 
ConcurrentModificationException"

  https://github.com/apache/jena/pull/1994

sszuev
GH-1419: fix DatasetGraphMap#clear
  https://github.com/apache/jena/issue/1419

sszuev
GH-1374: add copyWithRegisties Context helper method
  https://github.com/apache/jena/issue/1374

---

Key upgrades

org.apache.lucene : 9.5.0 -> 9.7.0
org.apache.commons:commons-lang3: 3.12.0 -> 3.13.0
org.apache.sis.core:sis-referencing : 1.1 -> 1.4

 Jena 5

Jena 4.10.0 is the last planned release of Jena 4.x.x

There are deprecations to indicate functionality to be removed in Jena5.

Jena5 will require Java17.

 Release Vote

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release.

Staging repository:
  https://repository.apache.org/content/repositories/orgapachejena-1060

Proposed dist/ area:
  https://dist.apache.org/repos/dist/dev/jena/

Keys:
  https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
  https://github.com/apache/jena/commit/21500eeb1b

Git Commit Hash:
  21500eeb1b616b6bc370e6c900a3e027b37763c7

Git Commit Tag:
  jena-4.10.0

This vote will be open until at least

  Friday, 27th October 2023 at 18:00 UTC

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

Thanks,
Andy

Checking:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?
+ can the source archive be built?
  (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
  (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
   if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?
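For the checksum item in the list above, a minimal JDK-only sketch of computing a SHA-512 digest and comparing it with a published value. The class and method names are hypothetical, and it assumes the published checksum text starts with the hex digest (anything after the first whitespace, such as a file name, is ignored); check the actual file layout before relying on it.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ChecksumCheck {
    // Compute the SHA-512 digest of the given bytes as lowercase hex.
    static String sha512Hex(byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Assumption: the first whitespace-separated token of the published text is the digest.
    static boolean matches(byte[] data, String published) {
        String expected = published.trim().split("\\s+")[0];
        return sha512Hex(data).equalsIgnoreCase(expected);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ChecksumCheck.java <artifact> <artifact.sha512>
        byte[] artifact = Files.readAllBytes(Path.of(args[0]));
        String published = Files.readString(Path.of(args[1]));
        System.out.println(matches(artifact, published) ? "OK" : "MISMATCH");
    }
}
```

This reads the whole artifact into memory, which keeps the sketch short; a streaming digest would be the better choice for very large files. It checks integrity only and is not a substitute for verifying the GPG signatures.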


Re: [Lazy] Jena5 Branch

2023-10-22 Thread Andy Seaborne




On 22/10/2023 11:50, Bruno Kinoshita wrote:


Starting to provide a format, then stopping, is not very helpful.
CycloneDX is easier to produce and has more uptake in ASF.



I had a look but couldn't find anything conclusive on which format works
best for the EU Cyber Resilience Act.


CRA is a lot of other issues for open source :-(


GitHub is exporting SPDX I think:
https://github.blog/2023-03-28-introducing-self-service-sboms/



Useful.
Now we have two to choose between :-)

As indented JSON they are:

SPDX plugin:
  1,516,541 chars

GH generated : gh sbom | jq
626,773 chars

Andy


You can create one for Jena from
https://github.com/apache/jena/network/dependencies and that will give you
an SPDX JSON.
Combining SPDX with RAT could be useful.




#TIL! I think RAT had/has some older issues (can't recall if in the tool,
maven plugin, or both) but had a low activity. Maybe with that there will
be more commits/releases.

Links I have found useful:




Thanks for the links to external and ASF material! Someone shared links in
the Commons security list too about SBOM discussing VEX files (OSV was also
mentioned):

- https://www.cisa.gov/sbom
-
https://www.cisa.gov/sites/default/files/2023-04/minimum-requirements-for-vex-508c.pdf
- https://github.com/openvex (

  PS SPDX can be RDF!, and in fact the maven plugin uses Jena!

Jena 3.10.0 :-(



Maybe we can ping someone that maintains it, or even send a PR to bump it
to Jena 4, warning that there will be a jena5 soon too.

Cheers,

Bruno


Re: [Lazy] Jena5 Branch

2023-10-22 Thread Andy Seaborne




On 21/10/2023 22:51, Bruno Kinoshita wrote:

Thanks Andy!

I had a go at the UI dependencies upgrade, and found some deprecation
warnings (from vite I think) and e2e tests that need to be fixed. I'm doing
those tasks for the jena5 branch.


Great - thank you.

It's time to get 4.10.0 out and switch over.


Will also try to look at the BOM issues as I may need that for $work
(future EU regulations and all).


tl;dr:

Let's publish CycloneDX and hold back on SPDX for now.
There's a lot going on in ASF and the picture will become clearer.
I don't think Jena is special or different in its requirements.

Starting to provide a format, then stopping, is not very helpful.
CycloneDX is easier to produce and has more uptake in ASF.

The US gov accepts CycloneDX as well as SPDX and Software Identification 
(SWID) tag.


I'd be surprised if the EU does not align,




SPDX is quite detailed. It was originally for license management. I'm 
beginning to think it is less useful for simple machine generation and 
expects manual configuration to at least check all its deductions, and 
probably change them.  Having some coverage of license information but 
not full coverage seems like a bad idea for both us and users.


Interestingly, RAT has a class "SpdxBuilder".
Combining SPDX with RAT could be useful.

In ASF, only Commons is producing SPDX that I can find.

Links I have found useful:

https://www.activestate.com/blog/why-the-us-government-is-mandating-software-bill-of-materials-sbom/

In ASF:
https://cwiki.apache.org/confluence/display/COMDEV/SBOM

Discussion on
https://github.com/apache/logging-log4j2/issues/1707
 -- worth tracking

and e.g.

https://github.com/apache/spark/pull/39401
  Dongjoon Hyun has been doing quite a few of the PRs
  for adding CycloneDX to projects so his work is getting
  wide review.

Andy

PS SPDX can be RDF!, and in fact the maven plugin uses Jena!
Jena 3.10.0 :-(



Cheers,

Bruno

On Fri, 20 Oct 2023 at 11:56, Andy Seaborne  wrote:




On 19/10/2023 22:21, Bruno Kinoshita wrote:

Great progress Andy!

I saw that you created several issues for Jena5.


Sorry - because it's a branch, github hasn't closed them when the PR was
merged.

https://github.com/apache/jena/issues?q=is%3Aissue+is%3Aopen+label%3AJena5

should make things clearer

There's always a lot of things that would be nice but that then delays
the release.

I'm going through my notes and I'll raise issues.

+ There is one "must" change: normalization of language tags.

https://github.com/apache/jena/issues/2039

because that impacts on-disc data.

+ The SBOM SPDX files don't look very good - too many NOASSERTION.


https://repository.apache.org/content/groups/snapshots/org/apache/jena/jena-arq/5.0.0-SNAPSHOT/jena-arq-5.0.0-20231018.142515-1.spdx.json

but maybe that is just how it is. I'm not sure what "good practice" in
ASF is or what "good practice" is generally (e.g. SBOMs for every
artifact is best or are they just clutter?).

Many projects produce CycloneDX files but not SPDX.

  > Are there any easy ones that you need help with?

2048 maybe

Should we do a general update of dependencies in FusekiUI?

  Andy


Cheers
Bruno

On Wed, 18 Oct 2023 at 17:15, Andy Seaborne  wrote:




On 12/10/2023 10:05, Andy Seaborne wrote:


On 06/10/2023 11:47, Andy Seaborne wrote:

There's a large PR for a new branch "jena5"

  https://github.com/apache/jena/pull/2029

of what I've managed to do so far.

It's not finished.

   Andy


I'd like to bring the PR in as a branch and setup Jenkins to produce
snapshot artifacts.


Branch setup, code merged to branch "jena5"

There will be forced pushes due to rebasing to "main".

This will end when Jena 4.10.0 is released which makes a nice, clear
point at which to create a jena4 and make main jena5 development.

There are one or two items that need to go into 4.10.0 before that can
be released.

Jenkins is deploying 5.0.0-SNAPSHOT to the Apache snapshots repository.

https://repository.apache.org/content/repositories/snapshots/

   Andy









Re: [Lazy] Jena5 Branch

2023-10-20 Thread Andy Seaborne




On 19/10/2023 22:21, Bruno Kinoshita wrote:

Great progress Andy!

I saw that you created several issues for Jena5.


Sorry - because it's a branch, github hasn't closed them when the PR was 
merged.


https://github.com/apache/jena/issues?q=is%3Aissue+is%3Aopen+label%3AJena5

should make things clearer

There's always a lot of things that would be nice but that then delays 
the release.


I'm going through my notes and I'll raise issues.

+ There is one "must" change: normalization of language tags.

https://github.com/apache/jena/issues/2039

because that impacts on-disc data.
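A short sketch of why this matters for stored data. It uses the JDK's Locale class to put a language tag into the conventional BCP 47 case form; this is only an illustration and is not claimed to be how Jena normalizes tags (Locale also rewrites some legacy tags, so it is not a drop-in replacement).

```java
import java.util.Locale;

class LangTagNormalize {
    // Put a well-formed language tag into conventional case, e.g. "en-gb" to "en-GB".
    // Illustration only: uses java.util.Locale, not Jena's own normalization code.
    static String normalize(String tag) {
        return Locale.forLanguageTag(tag).toLanguageTag();
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"en-gb", "EN-GB", "zh-hant-tw"}) {
            System.out.println(tag + " -> " + normalize(tag));
        }
        // A tag stored before normalization no longer matches the normalized
        // form when compared as a plain string, which is the on-disc concern.
        String stored = "en-gb";
        System.out.println("stored matches normalized: " + stored.equals(normalize(stored)));
    }
}
```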

+ The SBOM SPDX files don't look very good - too many NOASSERTION.

https://repository.apache.org/content/groups/snapshots/org/apache/jena/jena-arq/5.0.0-SNAPSHOT/jena-arq-5.0.0-20231018.142515-1.spdx.json

but maybe that is just how it is. I'm not sure what "good practice" in 
ASF is or what "good practice" is generally (e.g. SBOMs for every 
artifact is best or are they just clutter?).


Many projects produce CycloneDX files but not SPDX.

> Are there any easy ones that you need help with?

2048 maybe

Should we do a general update of dependencies in FusekiUI?

Andy


Cheers
Bruno

On Wed, 18 Oct 2023 at 17:15, Andy Seaborne  wrote:




On 12/10/2023 10:05, Andy Seaborne wrote:


On 06/10/2023 11:47, Andy Seaborne wrote:

There's a large PR for a new branch "jena5"

 https://github.com/apache/jena/pull/2029

of what I've managed to do so far.

It's not finished.

  Andy


I'd like to bring the PR in as a branch and setup Jenkins to produce
snapshot artifacts.


Branch setup, code merged to branch "jena5"

There will be forced pushes due to rebasing to "main".

This will end when Jena 4.10.0 is released which makes a nice, clear
point at which to create a jena4 and make main jena5 development.

There are one or two items that need to go into 4.10.0 before that can
be released.

Jenkins is deploying 5.0.0-SNAPSHOT to the Apache snapshots repository.

https://repository.apache.org/content/repositories/snapshots/

  Andy





Towards Jena 4.10.0

2023-10-19 Thread Andy Seaborne

At the moment:
  https://s.apache.org/jena-4.10.0-issues

jena-4.10.0 has 20 closed issues and 42 PRs

There is still some sorting and PR catch-up to do.

Jena 4.10.0 still has a minimum requirement of Java11, not Java17.

Jena 4.9.0 was 2023-07-08.

Andy


Re: [Lazy] Jena5 Branch

2023-10-18 Thread Andy Seaborne




On 12/10/2023 10:05, Andy Seaborne wrote:


On 06/10/2023 11:47, Andy Seaborne wrote:

There's a large PR for a new branch "jena5"

    https://github.com/apache/jena/pull/2029

of what I've managed to do so far.

It's not finished.

 Andy


I'd like to bring the PR in as a branch and setup Jenkins to produce 
snapshot artifacts.


Branch setup, code merged to branch "jena5"

There will be forced pushes due to rebasing to "main".

This will end when Jena 4.10.0 is released which makes a nice, clear 
point at which to create a jena4 and make main jena5 development.


There are one or two items that need to go into 4.10.0 before that can 
be released.


Jenkins is deploying 5.0.0-SNAPSHOT to the Apache snapshots repository.

https://repository.apache.org/content/repositories/snapshots/

Andy


Re: [Lazy] Jena5 Branch

2023-10-12 Thread Andy Seaborne




On 12/10/2023 10:40, Bruno Kinoshita wrote:
...

Given that I believe most of the Jena development should now be focused on
Jena5, wouldn't it make more sense to create a Jena4 branch, merge Jena5
branch into main, and backport bug fixes to the Jena4 branch as needed?

I think we might even be able to cut releases from that branch.


The maven release plugin should work on a branch.


That way, I think we could say that the official version under development
is Jena5, and Jena4 is now in hotfix maintenance, until Jena5 is released
(plus whatever time we need/can to support it in the future).


Good point about showing that jena5 is the "official version under 
development".


Since 4.9.0, there are about 18 closed non-Jena5 issues, and 37 PRs 
mostly dependency upgrades.


https://s.apache.org/jena-4.10.0-issues.

I think we should do 4.10.0 as normal (which is "soon"ish), wait a bit 
to make sure nothing horrendous turns up, then switch. It creates space 
for 5.0.0 in the release cycle.


That becomes the official split point jena4 and jena5. No more rebasing 
jena4 onto jena5!


5.0.0 might be a -beta or -M1 or -rc1, though I'm not sure how much take-up 
they will get at our scale. There are changes which will slow switch 
over, but other than that it's at the same usability level as 4.x.x.


"main" is protected - no forced pushes - so seeing Jena5 hasn't got some 
that it is reasonably stable, has been building SNAPSHOTs and has been used.


Andy



Cheers

Bruno



On Thu, 12 Oct 2023 at 11:05, Andy Seaborne  wrote:



On 06/10/2023 11:47, Andy Seaborne wrote:

There's a large PR for a new branch "jena5"

 https://github.com/apache/jena/pull/2029

of what I've managed to do so far.

It's not finished.

  Andy


I'd like to bring the PR in as a branch and setup Jenkins to produce
snapshot artifacts.

The branch might still be liable to force pushes to keep the history
comprehensible, such as rebasing it to main, and finally when switching
to this branch to be main if we use rebase and merge.

I think having a baseline for people to look at and maybe even try out,
is better than waiting until the very last minute to become Jena5.

Maybe we should use "rebase and merge" for PRs from now on?

  Andy





[Lazy] Jena5 Branch

2023-10-12 Thread Andy Seaborne



On 06/10/2023 11:47, Andy Seaborne wrote:

There's a large PR for a new branch "jena5"

    https://github.com/apache/jena/pull/2029

of what I've managed to do so far.

It's not finished.

     Andy


I'd like to bring the PR in as a branch and setup Jenkins to produce 
snapshot artifacts.


The branch might still be liable to force pushes to keep the history 
comprehensible, such as rebasing it to main, and finally when switching 
to this branch to be main if we use rebase and merge.


I think having a baseline for people to look at and maybe even try out, 
is better than waiting until the very last minute to become Jena5.


Maybe we should use "rebase and merge" for PRs from now on?

Andy


[Draft] Apache Jena - October 2023

2023-10-12 Thread Andy Seaborne

## Description:

The mission of Jena is the creation and maintenance of software related 
to Java framework for building Semantic Web applications


## Project Status:
Current project status: Ongoing
Issues for the board: None

## Membership Data:
Apache Jena was founded 2012-04-18 (11 years ago)
There are currently 19 committers and 13 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Aaron Coburn on 2019-01-22.
- Arne Bernhardt was added as committer on 2023-07-11

## Project Activity:
Development is now around Jena 5, using the major version change for 
both external changes and code improvements.


External changes include building convenience binaries for Java17 in 
keeping with the project supporting two Java LTS versions; switching from 
javax.servlet to jakarta.servlet; updating to Eclipse Jetty12; and 
removing a dependency on a project that is no longer active.


Project development for Jena5 includes removing deprecated code and 
tidying up. There is a new standards compliant RDF/XML parser which is 
both faster and easier to maintain.


## Community Health:
The community continues to answer questions on the users list. The dev 
list has been quieter because the project has moved some more automated 
email off that list, general seasonal effects, and because the Jena5 
development has proceeded on github.


Re: Proposed changes for Jena5

2023-10-06 Thread Andy Seaborne

There's a large PR for a new branch "jena5"

   https://github.com/apache/jena/pull/2029

of what I've managed to do so far.

It's not finished.

Andy


jena-jdbc [Was: Preparing for Jena5]

2023-09-25 Thread Andy Seaborne

For Jena5, it might be a good opportunity to retire jena-jdbc.

The code is mostly updated to jena5 - there is some work remaining 
because of Jetty12 in the tests for jena-jdbc-driver-remote. So the 
retired code would be ready to bring back, or at least quite close.


Andy



Re: Proposed changes for Jena5

2023-09-01 Thread Andy Seaborne




On 31/08/2023 19:25, Andy Seaborne wrote:


RRX is actually 2 parsers :-).

One is SAX based, and handles XML entities. The other is StAX based; it 
was first written as a learning exercise. The StAX API does not support XML 
entities. 


Correction - the StAX API does support character entities.

It was just that Jena, by default, switches all DTD and entity 
features off for security reasons. External entities must be disabled.
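A minimal sketch of that kind of locked-down StAX setup, using only the standard javax.xml.stream property names. The class and method names are hypothetical and this is not taken from the Jena code base. Character references and the predefined entities such as &amp; still work with DTD support switched off.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SafeStax {
    // Build a StAX factory with DTD processing and external entities switched off.
    static XMLInputFactory safeFactory() {
        XMLInputFactory f = XMLInputFactory.newInstance();
        f.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        f.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return f;
    }

    // Concatenate all character data in the document.
    static String text(String xml) {
        try {
            XMLStreamReader r = safeFactory().createXMLStreamReader(new StringReader(xml));
            StringBuilder sb = new StringBuilder();
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.CHARACTERS) {
                    sb.append(r.getText());
                }
            }
            r.close();
            return sb.toString();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Not well-formed or uses a disabled feature", e);
        }
    }

    public static void main(String[] args) {
        // Character reference and predefined entity, no DTD needed.
        System.out.println(text("<a>&#65; &amp; B</a>"));
    }
}
```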


Andy


Proposed changes for Jena5

2023-08-31 Thread Andy Seaborne

Here is the status of  my work on Jena5.

These are changes done on a branch in my development repo. I'm going to 
raise issues for each of these changes and give them all the right 
GH- commit message, then propose a Jena repo branch.


There's a note about the RDF/XML parser below.

 Completed

== Set version to 5.0.0-SNAPSHOT

== Build set to Java17
  Upgrade graalvm dependency (test); GraalVM now requires Java17.

== Rename javax.servlet -> jakarta.servlet
  Update to jetty11

== Node clear-up
- general review and simplification
- Remove BlankNodeId as indexing label from Node_Blank
- LiteralLabel
   Convert LiteralLabel to a class
   Remove use from APIs
 (mostly) - RDFDatatype still references it but
 I'm not clear why it doesn't use Node_Literal.
   Rework LiteralLabel as term-centric as well as value-centric [1]

== Remove old and partial RDF 1.0 code
   (it was used inconsistently)

== Move ModelMaker into ontology area (it is only used in ont)

== Model API and Model impl
- Remove deprecated
- Remove isXML/isWellFormed from APIs (seems to be meaningless)
- Simplify containers iterators (implementation)
- Remove TripleBoundary, StatementBoundary, GraphExtract, ModelExtract
Not used by jena-core.
- Remove Selector (already deprecated and unused)
- Remove deprecated: ResourceF
- RDFReaderF and RDFWriterF
  Remove the unnamed language operations which are RDF/XML.
  Deprecate the named language forms in Model.
- Remove reification (interface methods were, mostly, deprecated)

== Add Jena BOM module

== Update to SLF4j 2.x

== Remove unused assemblers.

== Remove JSON-LD 1.0 support

 TO DO:

Update for Jetty12

Switch to term graphs.

 Desirable

Replace normal usage of the RDF/XML reader with something more 
maintainable. [2]


= Reorgs

Call TDB1 "tdb1"
- Rename artifact jena-tdb as jena-tdb1.
- Move the package tree to org.apache.jena.tdb1
   Leave legacy API at "org.apache.jena.tdb"
"org.apache.jena.tdb.TDBFactory" -> "org.apache.jena.tdb1.TDB1Factory"

Andy

[1] LiteralLabel

The idea of LiteralLabel changes is to keep work off the critical part 
of creating and streaming literals and only creating the value if 
required. The "value" here is the Model API Java type support and the 
current GraphMem indexing value.


Ideally, I'd like to pull LiteralLabel into Node_Literal and not have a 
separate class but that may be a step too far.
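The "only create the value if required" idea can be sketched as below. This is a hypothetical class for illustration, not LiteralLabel or Node_Literal: the term (lexical form plus datatype IRI) is all that is stored at creation, and the Java value is parsed on first request and cached. The sketch is not thread-safe.

```java
import java.math.BigInteger;

class LazyLiteralSketch {
    static final String XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer";

    // Hypothetical term-centric literal: creating it does no value parsing.
    static final class Literal {
        private final String lexicalForm;
        private final String datatypeIri;
        private Object value;               // filled in on first call to value()
        private boolean valueReady = false;
        private int parses = 0;             // instrumentation for the demo only

        Literal(String lexicalForm, String datatypeIri) {
            this.lexicalForm = lexicalForm;
            this.datatypeIri = datatypeIri;
        }

        String lexicalForm() { return lexicalForm; }
        String datatypeIri() { return datatypeIri; }
        int parses()         { return parses; }

        // Value-centric access: parse once, then reuse the cached value.
        Object value() {
            if (!valueReady) {
                value = parse();
                valueReady = true;
            }
            return value;
        }

        private Object parse() {
            parses++;
            return XSD_INTEGER.equals(datatypeIri) ? new BigInteger(lexicalForm) : lexicalForm;
        }
    }

    public static void main(String[] args) {
        Literal lit = new Literal("42", XSD_INTEGER);
        System.out.println("parses after creation: " + lit.parses());
        System.out.println("value: " + lit.value());
        lit.value();
        System.out.println("parses after two value() calls: " + lit.parses());
    }
}
```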


[2] The jena-core RDF/XML reader (ARP) in the oaj.rdfxml.xmlinput and 
oaj.rdfxml.xmlinput0 packages is complicated.


PR 1774 changed ARP to use the system IRIx interface, not call jena-iri 
directly. And the original ARP is also available. 1774 did some cleanup 
but was quite conservative in that.


https://github.com/apache/jena/pull/1774

ARP has lots of features and it is clear it was developed while RDF/XML 
was being originally spec'ed. There are features and warnings that 
aren't in the spec. It does not integrate with the RIOT parser builder 
very well.


I tried to do a clean-up but I've come to the conclusion it is 
better/safer to keep ARP as it is after 1774, and write a new RDF/XML 
parser (RRX - RIOT RDF/XML parser) with the design goal of being just an 
application/rdf+xml parser.


The existing ARP would remain in jena-core. Testing the new parser is 
done with "run ARP, run RRX" then test whether the outputs, including 
occurrence of warnings, are the same. The W3C test suite has mandated 
warnings. ARP goes further.  The order of triple output is also the same 
(except for reification, where the ARP output is backwards!)


RRX is actually 2 parsers :-).

One is SAX based, and handles XML entities. The other is StAX based; it 
was first written as a learning exercise. The StAX API does not support XML 
entities. SAX is a stream of parser events and requires the code to have 
a coded state machine; StAX uses function call descent to know where in 
the grammar it is which is easier to understand.


They should produce identical output, down to triple order and messages.
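The difference in style, and the "identical output" check, can be sketched with the JDK's own parsers. This is a toy that lists element names with their nesting depth, not RRX: the SAX handler has to carry explicit state (a depth counter), while the StAX version knows its position from the call stack. The class and method names are hypothetical, and the example input has no namespace prefixes so qName and local name coincide.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxVsStax {
    // SAX: the parser pushes events; the handler tracks where it is with explicit state.
    static List<String> viaSax(String xml) {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            int depth = 0;
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                out.add("  ".repeat(depth) + qName);
                depth++;
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    // StAX: the code pulls events; recursion depth mirrors element nesting.
    static List<String> viaStax(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT) {
                    element(r, 0, out);
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    // Entered at START_ELEMENT; returns when the matching END_ELEMENT is consumed.
    private static void element(XMLStreamReader r, int depth, List<String> out)
            throws XMLStreamException {
        out.add("  ".repeat(depth) + r.getLocalName());
        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                element(r, depth + 1, out);
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                return;
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b><c/></b><d/></a>";
        System.out.println(viaSax(xml));
        System.out.println("Same output: " + viaSax(xml).equals(viaStax(xml)));
    }
}
```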

RRX-SAX would be the one that is normally used from RIOT. RRX-StAX is a 
"stay honest".


ARP is 66 java files. Each RRX parser is one file.

RRX should work with any XML parser because they don't make any 
assumptions about optional supported XML parsing features. Development 
has been with the JDK internal one.


RE: Mailing list threading improvements

2023-08-17 Thread Andy Seaborne

This is an improvement!

On 2023/08/17 08:27:39 Christofer Dutz wrote:

TL;DR: We’re updating how auto-generated email from Github will be
threaded on your mailing lists. If you want to keep the old defaults,
details are below.

We’re pleased to let you know that we’re tweaking the way that auto-
generated email from Github will appear on your mailing lists. This
will lead to more human-readable subject lines, and the ability of most
modern mail clients to correctly thread discussions originating on
Github.

Background: Many project mailing lists receive email auto-generated by
Github. The way that the subject lines are crafted leads to messages
from the same topic not being threaded together by most mail clients.
We’re fixing that.

The way that these messages are threaded is defined by a file -
.asf.yml - in your git repositories. We’re changing the way that it
will work by default if you don’t choose settings. If you’re happy for
us to make this change, don’t do anything - the change will happen on
October the 1st 2023.

Details of the current default, as well as the proposed changes, are on
the following page, along with instructions on how to keep your current
settings, if you prefer:

https://community.apache.org/contributors/mailing-lists.html#configuring-the-subject-lines-of-the-emails-being-sent

Please copy d...@community.apache.org
on any feedback.

Chris, on behalf of the Comdev PMC



Re: Preparing for Jena5 - API deprecations

2023-07-24 Thread Andy Seaborne

Another item for deprecation-removal.

Model.query(Selector) and all the Selector code.

Nowadays, JDK java can cleanly do filtering and there is SPARQL for 
anything more complex.
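A small sketch of the kind of filtering meant here, using plain JDK streams and a lambda in place of a Selector object. The Statement record below is a hypothetical stand-in, not the Jena Model API interface of the same name.

```java
import java.util.List;
import java.util.stream.Collectors;

class FilterInsteadOfSelector {
    // Hypothetical stand-in for a statement; not the Jena Statement interface.
    record Statement(String subject, String predicate, String object) {}

    // Keep only statements with the given predicate: a one-line lambda, no Selector class.
    static List<Statement> withPredicate(List<Statement> statements, String predicate) {
        return statements.stream()
                .filter(st -> st.predicate().equals(predicate))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Statement> data = List.of(
            new Statement("ex:a", "ex:name", "Alice"),
            new Statement("ex:a", "ex:knows", "ex:b"),
            new Statement("ex:b", "ex:name", "Bob"));
        withPredicate(data, "ex:name").forEach(System.out::println);
    }
}
```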


Andy


Reification [Was: Preparing for Jena5 - API deprecations]

2023-07-22 Thread Andy Seaborne
Reification is only supported in the Model API, not Graph. It's already 
simpler than it was when first introduced, when it had 3 different modes.


The complexity on storage was huge.

https://www.hpl.hp.com/techreports/2003/HPL-2003-266.pdf

Reification subsequently got simplified to library code in the Model API 
which corresponds to the original "Reification standard mode".


https://jena.apache.org/documentation/notes/reification_previous.html

See ReifierStd, which is all static functions. (There are no 
"hiddenTriples" - that was an old feature)


   Andy
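For readers unfamiliar with standard reification, a sketch of the "library code" view as a single static function: one statement is described by four triples using the RDF vocabulary (rdf:type rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The Triple record and method name are hypothetical; this is not the ReifierStd source.

```java
import java.util.List;

class ReificationSketch {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical stand-in for a triple; not a Jena class.
    record Triple(String s, String p, String o) {}

    // Standard reification: four triples about the node that stands for the statement.
    static List<Triple> reify(String statementNode, Triple t) {
        return List.of(
            new Triple(statementNode, RDF + "type", RDF + "Statement"),
            new Triple(statementNode, RDF + "subject", t.s()),
            new Triple(statementNode, RDF + "predicate", t.p()),
            new Triple(statementNode, RDF + "object", t.o()));
    }

    public static void main(String[] args) {
        Triple t = new Triple("http://example/s", "http://example/p", "http://example/o");
        reify("_:stmt1", t).forEach(System.out::println);
    }
}
```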


[ANNOUNCE] Arne Bernhardt elected as Committer

2023-07-11 Thread Andy Seaborne
The Apache Jena PMC have invited Arne Bernhardt to become a committer 
and we are pleased to announce that he has accepted.


Arne has made contributions of new implementations of Jena in-memory 
graph stores, as well as improvements to the existing GraphMem. These 
are supported by detailed analysis of the performance and memory 
footprint of all the designs and also benchmark testing in the build.


"Committer" recognizes commitment to the project. If you would like to 
learn more please see


   http://apache.org/foundation/how-it-works.html#roles

Please join us in welcoming Arne as a committer.

Andy


[Draft] Apache Jena - July 2023

2023-07-09 Thread Andy Seaborne

Draft board report

## Description:

The mission of Jena is the creation and maintenance of software related 
to Java framework for building Semantic Web applications


## Project Status:

Current project status: Ongoing
Issues for the board: none

[[
"project status" is a new required item. The options are:
  * New: a top-level project that's just getting started
  * Ongoing: With high, moderate or low activity, which you may quantify
if appropriate
  * Dormant: Not much happening on the code, but at least 3 PMC members
ready to engage if needed
  * At risk: Not enough active PMC members, or a significant number of
contributors left the project, etc.
  * Considering moving to the Attic: a project that's about to move to
the Attic, or discussing that
]]


## Membership Data:

Apache Jena was founded 2012-04-18 (11 years ago)
There are currently 18 committers and 13 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:7.

Community changes, past quarter:
- No new PMC members. Last addition was Aaron Coburn on 2019-01-22.
- No new committers. Last addition was Greg Albiston on 2019-07-08.

## Project Activity:

The project released version 4.8.0 on April 23, 2023
and 4.9.0 on July 8, 2023.
Both releases included addressing security issues.

The project is discussing version 5.0.0. There are two external changes 
in the Java ecosystem that affect the project - a new LTS version (the 
project policy is to support the last two LTS versions of Java) and the 
J2EE javax to jakarta package transition. The project may make other 
incompatible changes that affect Jena users who use the project as a 
code library.


## Community Health:

Activity levels are normal.

One part of 4.9.0 is a significant contribution to re-implement the 
in-memory graphs. At the same time, the new implementations follow the 
W3C standards as closely as possible. In 4.9.0, the new implementations 
are "opt-in". Whether they become the default at 5.0.0 is not yet decided.


Re: [RESULT] [VOTE] Apache Jena 4.9.0 RC1

2023-07-08 Thread Andy Seaborne

I'll release the build outputs and send the ANN.

Some of completing the release will have to wait until tomorrow.

Andy

On 08/07/2023 21:58, Andy Seaborne wrote:
The VOTE passes with PMC votes from Bruno, Rob and Andy, together with votes 
from Arne and Marco.


     Andy


[RESULT] [VOTE] Apache Jena 4.9.0 RC1

2023-07-08 Thread Andy Seaborne
The VOTE passes with PMC votes from Bruno, Rob and Andy, together with votes 
from Arne and Marco.


Andy

On 04/07/2023 20:22, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena 4.9.0.
This is the first release candidate.

The deadline is

     Saturday, 8th July 2023 at 05:00 UTC

Please vote to approve this release:

     [ ] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...


Re: [VOTE] Apache Jena 4.9.0 RC1

2023-07-04 Thread Andy Seaborne

+1

Andy

On 04/07/2023 20:22, Andy Seaborne wrote:

Hi,

Here is a vote on the release of Apache Jena 4.9.0.
This is the first release candidate.

The deadline is

     Saturday, 8th July 2023 at 05:00 UTC

Please vote to approve this release:

     [ ] +1 Approve the release
     [ ]  0 Don't care
     [ ] -1 Don't release, because ...


[VOTE] Apache Jena 4.9.0 RC1

2023-07-04 Thread Andy Seaborne

Hi,

Here is a vote on the release of Apache Jena 4.9.0.
This is the first release candidate.

The deadline is

Saturday, 8th July 2023 at 05:00 UTC

Please vote to approve this release:

[ ] +1 Approve the release
[ ]  0 Don't care
[ ] -1 Don't release, because ...

 Items in this release

Arne Bernhardt
https://github.com/apache/jena/issues/1912
New implementations of in-memory graphs with better storage and performance.

See the issue for performance details.

See GraphMemFactory for access to these new graph implementations.

Arne has also provided a performance analysis and improvements for the 
existing default in-memory graphs together with a benchmarking framework

  https://github.com/apache/jena/pull/1279

--

Switch from TriplyDB/(yasr,yasqe) to zazuko/(yasr,yasqe)
to pick up fixes.
Thank you Zazuko!

--

SERVICE on/off control
https://github.com/apache/jena/pull/1906

Provide the ability to switch off all SERVICE processing completely.
Use
  Code: arq:httpServiceAllowed
  or http://jena.apache.org/ARQ#httpServiceAllowed=false
to disable.

e.g.
  fuseki-server --set arq:httpServiceAllowed=false 

--

Additional restrictions and control for SPARQL script functions
  https://github.com/apache/jena/pull/1908

There is a new Jena context setting
  http://jena.apache.org/ARQ#scriptAllowList
which is on the command line:
  arq:scriptAllowList
and java constant
  ARQ.symCustomFunctionScriptAllowList

Its value is a comma separated list of function names.
  "function1,function2"
Only the functions in this list can be called from SPARQL.

As in Jena 4.8.0, the Java system property "jena:scripting" must also be 
set to "true" to enable script functions.
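
As a rough illustration of the allow-list semantics described above, a comma-separated setting can be parsed into a set and consulted before a script function is called. This is only a sketch and not Jena's implementation: the class ScriptAllowList and its methods are hypothetical, and trimming whitespace around names is an assumption.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Jena.
public class ScriptAllowList {
    private final Set<String> allowed;

    // Parse a value such as "function1,function2" into a set of names.
    // Trimming whitespace and dropping empty entries is an assumption.
    public ScriptAllowList(String setting) {
        this.allowed = Arrays.stream(setting.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .collect(Collectors.toSet());
    }

    // Only functions named in the list may be called.
    public boolean isAllowed(String functionName) {
        return allowed.contains(functionName);
    }

    public static void main(String[] args) {
        ScriptAllowList list = new ScriptAllowList("function1,function2");
        System.out.println("function1 allowed: " + list.isAllowed("function1"));
        System.out.println("other allowed: " + list.isAllowed("other"));
    }
}
```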

  Website (when published):
   https://jena.apache.org/documentation/query/javascript-functions

--

Prepare for Jena5:
  Deprecate  JSON-LD 1.0 constants
  Deprecate  API calls that may be removed.

--

Specific SPARQL 1.2 parser, tracking the RDF-star working group.
  All features are also available in the default SPARQL parser.

--
Ryan Shaw (@rybesh)
  new Turtle RDFFormat
  https://github.com/apache/jena/issues/1924
--
Simon Bin (@SimonBin)
  A fix for incorrect integer cast in scripting.NV
  https://github.com/apache/jena/pull/1851
--
Alexander Ilin-Tomich (@ailintom)
  Fix for SPARQL_Update verification and /HTTP PATCH
--
Ryan Shaw (@rybesh)
  Script fix for additional classpath elements
  https://github.com/apache/jena/pull/1877
--
FusekiModules:
Issue: https://github.com/apache/jena/issues/1897

There is a change: the interface for automatically loading 
modules from the classpath is now FusekiAutoModule. The 
interface FusekiModule is now the configuration lifecycle only. This 
allows a Fuseki server to be set up programmatically with Fuseki 
modules, including custom ones from the calling application.


===
 Release Vote

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release.

Staging repository:
  https://repository.apache.org/content/repositories/orgapachejena-1059

Proposed dist/ area:
  https://dist.apache.org/repos/dist/dev/jena/

Keys:
  https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
  https://github.com/apache/jena/commit/84aa91e095

Git Commit Hash:
  84aa91e095e20e0e3c7a55c9780f285ef8fb54bb

Git Commit Tag:
  jena-4.9.0

This vote will be open until at least

  Saturday, 8th July 2023 at 05:00 UTC

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

Thanks,
Andy

Checking:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?
+ can the source archive be built?
  (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
  (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
   if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?
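
For the checksum item in the list above, one possible way to check a downloaded artifact is sketched here. It assumes the published checksum is a bare hex SHA-512 string; the class name ChecksumCheck and the command-line arguments are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative checker; assumes the published value is a bare hex SHA-512 string.
public class ChecksumCheck {
    // Hex-encoded SHA-512 digest of the given bytes.
    public static String sha512Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Compare the computed digest with the published one, ignoring case
    // and surrounding whitespace.
    public static boolean matches(byte[] data, String published) {
        return sha512Hex(data).equalsIgnoreCase(published.trim());
    }

    // Usage: java ChecksumCheck <artifact-file> <published-hex-checksum>
    public static void main(String[] args) throws IOException {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(matches(data, args[1]) ? "checksum OK" : "checksum MISMATCH");
    }
}
```

Checking the GPG signatures is a separate step and is not covered by this sketch.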


Re: Preparing for Jena5 - API deprecations

2023-07-04 Thread Andy Seaborne

Hi Andrew,

None - it only affects Fuseki.

While it is generally referred to as the javax.* transition, it is not 
all of javax. Only the javax packages that are part of J2EE move. Other 
javax packages are in the JDK and are unaffected.
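
To make the distinction concrete, here is a small sketch of the renaming rule. The prefix list contains only the two packages named in this thread (javax.servlet and javax.xml.bind) and is not a complete list of what moved; the class JakartaRename is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: maps the two J2EE packages mentioned in this thread.
public class JakartaRename {
    // Packages named in this thread that move to jakarta.*; not exhaustive.
    private static final List<String> MOVED =
            Arrays.asList("javax.servlet", "javax.xml.bind");

    // Rename a class or package name if it is under a moved prefix.
    public static String migrate(String name) {
        for (String prefix : MOVED) {
            if (name.equals(prefix) || name.startsWith(prefix + ".")) {
                return "jakarta" + name.substring("javax".length());
            }
        }
        // JDK packages such as javax.xml.parsers stay as they are.
        return name;
    }

    public static void main(String[] args) {
        System.out.println(migrate("javax.servlet.http.HttpServlet"));
        System.out.println(migrate("javax.xml.parsers.DocumentBuilder"));
    }
}
```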


https://blogs.oracle.com/javamagazine/post/transition-from-java-ee-to-jakarta-ee

There is one I haven't looked at yet: javax.xml.bind -> jakarta.xml.bind
which is local to org/apache/jena/datatypes/xsd/XSDhexBinary.java

There may be dependency changes that have an effect if an app gets them 
recursively via Jena.


For Fuseki:
javax.servlet.* -> jakarta.servlet.*

Here's a commit that makes the change

https://github.com/afs/jena/commit/c91abd94562d4c508ee0deedda3ed9f4d872a818

The only non-Fuseki changes are in:

jena-integration-tests/src/test/java/org/apache/jena/test/conn/StringHolderServlet.java
-- defines servlet

jena-permissions/pom.xml
-- because of shiro

pom.xml
-- version of Jetty, shiro dependency management

Andy

On 03/07/2023 19:32, Andrii Berezovskyi wrote:

Hello Andy,

May I ask if there is any impact on the non-Fuseki users of Jena in regard to the 
planned javax.* -> jakarta.* migration?

–Andrew.

On 3 Jul 2023, at 14:57, Andy Seaborne  wrote:

So far we have:

1/ Java21 is due to be released September 2023 and be a LTS release.
2/ javax.* -> jakarta.*
3/ Drop a separate JSON-LD 1.0 subsystem.
4/ Term graphs

One more thing I'd like to suggest for Jena5 is simplification. Look for 
code/features that are now out of date because of where the standards have gone.

Two are:

A/ LiteralLabel

It may be possible to merge this into Node_Literal itself which, together with 
generally simplifying the Node hierarchy, makes the system
There are what look like matters from RDF 1.0 WG in the code; RDF 1.1 makes RDF 
Terms simpler and clearer.

While this is in an "impl" package, it also features in some Model API calls.

B/ The "is well formed" flag ... also called "isXML" in some places at the node 
level despite the fact it is used for things other than XML. This does not need to be done when 
creating Node_Literals.

With term graphs, and parsing, value evaluations checking isn't required all 
the time but it adds costs to the critical path.

There is a control
  JenaParameters.enableEagerLiteralValidation
which is false and which controls how to respond to bad literals.


To allow for A and B, I'd like to deprecate API calls that involve them. It may 
turn out some parts need to be kept - I've only done an initial pass over the 
code - but I think it is better to warn now and not simply put in changes at 
Jena5 with no advance notice.

Andy




Preparing for Jena5 - API deprecations

2023-07-03 Thread Andy Seaborne

So far we have:

1/ Java21 is due to be released September 2023 and be a LTS release.
2/ javax.* -> jakarta.*
3/ Drop a separate JSON-LD 1.0 subsystem.
4/ Term graphs

One more thing I'd like to suggest for Jena5 is simplification. Look for 
code/features that are now out of date because of where the standards 
have gone.


Two are:

A/ LiteralLabel

It may be possible to merge this into Node_Literal itself which, together 
with generally simplifying the Node hierarchy, makes the system
There are what look like matters from RDF 1.0 WG in the code; RDF 1.1 
makes RDF Terms simpler and clearer.


While this is in an "impl" package, it also features in some Model API 
calls.


B/ The "is well formed" flag ... also called "isXML" in some places at 
the node level despite the fact it is used for things other than XML. 
This does not need to be done when creating Node_Literals.


With term graphs, and parsing, value evaluations checking isn't required 
all the time but it adds costs to the critical path.


There is a control
  JenaParameters.enableEagerLiteralValidation
which is false and which controls how to respond to bad literals.


To allow for A and B, I'd like to deprecate API calls that involve them. 
It may turn out some parts need to be kept - I've only done an initial 
pass over the code - but I think it is better to warn now and not simply 
put in changes at Jena5 with no advance notice.


Andy



Re: Towards Jena 4.9.0

2023-07-01 Thread Andy Seaborne

There has been another report of this problem on Stack Overflow.

All I can think of is that Jena 4.8.0 had an upgrade from Log4j 2.19 to 
2.20 and the way URLs are treated got pickier (the other report also has 
\ in JENA_HOME).
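
As a hedged illustration of why a backslash can matter: Java's own java.net.URI parser rejects a backslash in a "file:" string. This only shows the JDK parser; whether Log4j 2.20 fails for the same reason is the guess above, not something this sketch confirms. The paths and the class name LoggingUrlCheck are made up.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Shows how java.net.URI treats a Windows-style path placed after "file:".
public class LoggingUrlCheck {
    // True if the string parses as a URI according to java.net.URI.
    public static boolean isValidUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical JENA_HOME values, for illustration only.
        String withBackslash = "file:C:\\jena/log4j2.properties";
        String withSlash = "file:C:/jena/log4j2.properties";
        System.out.println(withBackslash + " valid: " + isValidUri(withBackslash));
        System.out.println(withSlash + " valid: " + isValidUri(withSlash));
    }
}
```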


Andy

On 29/06/2023 09:27, Andy Seaborne wrote:

There's a small change to the bat scripts:

- set LOGGING=file:%JENA_HOME%/log4j2.properties
+ set LOGGING=%JENA_HOME%/log4j2.properties

which seems to help but which I can't reliably test, not being a 
Windows user.


Could someone please check this change doesn't break some other pattern 
of use?


     Andy

PR
https://github.com/apache/jena/pull/1916/
from issue
https://github.com/apache/jena/issues/1911

Lots of files change because this is in the template and the bat files are 
regenerated.




In-memory graphs

2023-06-30 Thread Andy Seaborne
3 new in-memory graph implementations have just been merged into the 
code base.


https://github.com/apache/jena/issues/1912

Please try them out.

The new graphs are "same term", not "same value" and do not support 
Iterator.remove; this is the same as persistent graphs and the 
transactional in-memory graphs.
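
A minimal sketch of the difference between "same term" and "same value", using xsd:integer as the only value-aware datatype. The Literal class here is a stand-in written for this example and is not Jena's Node or LiteralLabel.

```java
import java.math.BigInteger;

// Illustrative stand-in for RDF literals; not Jena classes.
public class TermVsValue {
    public static final String XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer";

    public static final class Literal {
        public final String lexical;
        public final String datatype;
        public Literal(String lexical, String datatype) {
            this.lexical = lexical;
            this.datatype = datatype;
        }
    }

    // Same term: identical lexical form and identical datatype.
    public static boolean sameTerm(Literal a, Literal b) {
        return a.lexical.equals(b.lexical) && a.datatype.equals(b.datatype);
    }

    // Same value: for two xsd:integer literals compare the parsed numbers;
    // for anything else this sketch falls back to sameTerm.
    public static boolean sameValue(Literal a, Literal b) {
        if (XSD_INTEGER.equals(a.datatype) && XSD_INTEGER.equals(b.datatype)) {
            return new BigInteger(a.lexical).equals(new BigInteger(b.lexical));
        }
        return sameTerm(a, b);
    }

    public static void main(String[] args) {
        Literal one = new Literal("1", XSD_INTEGER);
        Literal plusOne = new Literal("+1", XSD_INTEGER);
        System.out.println("sameTerm:  " + sameTerm(one, plusOne));
        System.out.println("sameValue: " + sameValue(one, plusOne));
    }
}
```

Under "same term", a graph holding ":s :p 1" does not match a lookup for ":s :p +1"; under "same value" it does.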


The idea is that Jena switches to consistent behaviour throughout all 
implementations.


To try them out get a 4.9.0 development build (from today) or build from 
source and then enable with:


Jvm:

  -Djena:graphSameTerm=true

or command line

  JVM_ARGS="-Djena:graphSameTerm=true" some_cmd ...

or in Java code

  GraphMemFactory.setDftGraphSameTerm(true);

This affects the Model, Inf and Ontology APIs, when used with the 
current default choice of GraphMem. It has much less effect for SPARQL 
and Fuseki, which use term graphs except for the general dataset used to 
combine different models.


Andy


Re: Towards Jena 4.9.0

2023-06-29 Thread Andy Seaborne

There's a small change to the bat scripts:

- set LOGGING=file:%JENA_HOME%/log4j2.properties
+ set LOGGING=%JENA_HOME%/log4j2.properties

which seems to help but which I can't reliably test, not being a 
Windows user.


Could someone please check this change doesn't break some other pattern 
of use?


Andy

PR
https://github.com/apache/jena/pull/1916/
from issue
https://github.com/apache/jena/issues/1911

Lots of files change because this is in the template and the bat files are 
regenerated.




Re: Towards Jena 4.9.0

2023-06-23 Thread Andy Seaborne

On 23/06/2023 14:06, Arne Bernhardt wrote:

The switch to term-equality might break some code that uses the current
default implementation.
A switch in the GraphMemFactory in Jena 5.x to make it backwards compatible
seems to be a good option.


We don't get many points when we can make such changes.
Setting the default is major version territory.


In this case, the general Jena codebase should remain compatible with the
literal value equality semantics.


It is hard for Fuseki users to notice. The transactional in-memory 
dataset is already term-semantics.


It's easier for API users to configure details as necessary or to smooth 
their migration.



As far as I know, org.apache.jena.graph.Capabilities#handlesLiteralTyping
should be used to control the behaviour here. 


Capabilities can be unreliable - applications don't seem to check! TDB1 
and TDB2 canonicalize some known datatypes on input, which isn't the 
exact definition of term or value semantics. There will be one triple 
for ":s :p 1" and ":s :p +1", which is term like.


I think this is a simpler-is-better case. Applications make the choice 
by the ModelFactory call and have a single API setting for the default.


What we do know is that all other storage graphs are term-semantics and 
the issue doesn't come up very often. And when it does, it has been a 
matter of explaining the situation.


(FYI: There is one impl GraphPlain that undoes value-semantics.)


My guess is, we might find
some places where it is not considered yet, because GraphMem has been the
default for so many years.


Yes - in tests for example.

Where there are tests, move them to a test class which is specific to 
GraphMem (if not already).


Then one test for "current settings".


If there is not enough time to evaluate GraphMem2Fast over the summer, it
may be wise to start with GraphMem2Legacy as the default in Jena 5.x.
If the community sees a real advantage in GraphMem2Fast, we could make it
the new default in a later version.


As long as the Jena 5.x contract is term-semantics, we can adjust the choice 
of best implementation in minor versions.


Andy



    Arne

Am Fr., 23. Juni 2023 um 13:08 Uhr schrieb Andy Seaborne :




On 22/06/2023 21:08, Arne Bernhardt wrote:

Do you think it would be possible to integrate
https://github.com/apache/jena/issues/1912 in Jena  4.9.0 ?
So there would be enough time and feedback to see if it can replace
GraphMem as default in Jena 5.0.0?

   Arne


Yes.

A switch to term-semantics by default in graph/model is a 5.x thing but
the code can be available. Feedback would be good but we can't rely on
that; everyone is time-short.

So would this be extra calls in ModelFactory?
Possibly with a single switch so that the default can be made into one
of the new term graphs? These Models and Graphs get created implicitly as
well as by application calls to ModelFactory.

  Andy

Let's rename org.apache.jena.graph.Factory to
org.apache.jena.graph.GraphMemFactory at 5.0.0
It's annoying.

https://github.com/apache/jena/issues/1919
and PR 1920 to start the process.





Re: Bumps in the road(map)

2023-06-23 Thread Andy Seaborne




On 23/04/2023 15:16, Andy Seaborne wrote:

2/ javax.* -> jakarta.*

This is the difference between Jetty 10 and Jetty11. Jetty 12.0 is 
currently in beta.


But.

Spring Boot 2 is based on javax (Jetty10) and Spring Boot 3 uses jakarta 
(Jetty11 configured).


Spring Boot 2 to Spring Boot 3 includes other upgrades as well. [1]

A way to deal with this is switch to jakarta.* at Jena 5.


The change javax.servlet to jakarta.servlet (with Jetty11) is quite 
straightforward.


Andy


Re: Towards Jena 4.9.0

2023-06-23 Thread Andy Seaborne




On 22/06/2023 21:08, Arne Bernhardt wrote:

Do you think it would be possible to integrate
https://github.com/apache/jena/issues/1912 in Jena  4.9.0 ?
So there would be enough time and feedback to see if it can replace
GraphMem as default in Jena 5.0.0?

  Arne


Yes.

A switch to term-semantics by default in graph/model is a 5.x thing but 
the code can be available. Feedback would be good but we can't rely on 
that; everyone is time-short.


So would this be extra calls in ModelFactory?
Possibly with a single switch so that the default can be made into one 
of the new term graphs? These Models and Graphs get created implicitly as 
well as by application calls to ModelFactory.


Andy

Let's rename org.apache.jena.graph.Factory to 
org.apache.jena.graph.GraphMemFactory at 5.0.0

It's annoying.

https://github.com/apache/jena/issues/1919
and PR 1920 to start the process.


Towards Jena 4.9.0

2023-06-22 Thread Andy Seaborne

Jena 4.8.0 was 23/04/2023.
  And Java 21 LTS is September 19th.
  https://openjdk.org/projects/jdk/21/

So it's a bit early for 4.9.0 but it fits in better to keep away from summer 
and vacations.


At the moment:
  https://s.apache.org/jena-4.9.0-issues

jena-4.9.0 is 18 issues closed in 2 months and 36 PRs

Andy

---

Specific SPARQL 1.2 parser, tracking the RDF-star working group.
  All features are also available in the default SPARQL parser.

Arne Bernhardt has provided a performance analysis and
  improvements for the default in-memory graphs together
  with a benchmarking framework
  https://github.com/apache/jena/pull/1279

FusekiModules:
Issue: https://github.com/apache/jena/issues/1897

There is a change: the interface for automatically loading 
modules from the classpath is now FusekiAutoModule. The 
interface FusekiModule is now the configuration lifecycle only. This 
allows a Fuseki server to be set up programmatically with Fuseki 
modules, including custom ones from the calling application.


Simon Bin (@SimonBin)
A fix for incorrect integer cast in scripting.NV
https://github.com/apache/jena/pull/1851

Alexander Ilin-Tomich (@ailintom)
Fix for SPARQL_Update verification and /HTTP PATCH

Issue: https://github.com/apache/jena/issues/1873
Command line parser riot
Warn on arguments that allow quads but output triples
  And error/warn if quads encountered
Add argument --merge to project quads to triples.

Ryan Shaw (@rybesh)
Script fix for additional classpath elements
https://github.com/apache/jena/pull/1877

SERVICE on/off control
https://github.com/apache/jena/pull/1906

Provide the ability to switch off all SERVICE processing completely.
Use
  arq:httpServiceAllowed
  http://jena.apache.org/ARQ#httpServiceAllowed=false
to disable.

e.g.
  fuseki-server --set arq:httpServiceAllowed=false 

Additional restrictions and control for SPARQL script functions
https://github.com/apache/jena/pull/1908

There is a new Jena context setting
  http://jena.apache.org/ARQ#scriptAllowList
which is on the command line:
  arq:scriptAllowList
and java constant
  ARQ.symCustomFunctionScriptAllowList

Its value is a comma separated list of function names.
  "function1,function2"
Only the functions in this list can be called from SPARQL.

As in Jena 4.8.0, the Java system property "jena:scripting" must also be 
set to "true" to enable script functions.

  Website (when published):
   https://jena.apache.org/documentation/query/javascript-functions


Re: Why DatasetGraphInMemory?

2023-06-17 Thread Andy Seaborne




On 12/06/2023 21:36, Arne Bernhardt wrote:

Hi Andy

you mentioned RoaringBitmaps. I took the time to experiment with them.
They are really amazing. The performance of #add, #remove and #contains is
comparable to Java HashSet. RoaringBitmaps are much faster at iterating
over values and they perform bit operations even between two quite large
bitmaps like a charm. RoaringBitmaps also need less memory than a
Java HashSet (even less than an optimized integer hash set based on the
concepts in HashCommon).
A first graph implementation was easy to create. (albeit with a little help
from ChatGPT, as I had no idea how to use RoaringBitmaps yet).
One only needs an indexed set of all triples and three maps indexed by
subject, predicate and object and bitmaps as values.
Each bitmap contains all indices of the triples with the corresponding node.
To find SPO --> use the set with all triples.
To find S__, _P_, or __O --> lookup the bitmap in the corresponding map and
iterate over all indices mapping to triples via the indexed set.
To find SP_, S_O, or _PO --> lookup the two bitmaps for both given nodes,
perform an "and" operation with both bitmaps and again iterate over the
resulting indices mapping to triples via the indexed set.
Especially the query of _PO is incredibly fast compared to GraphMem or
similarly structured graphs.
Just for fun, I replaced the bitmaps with two sets of integers and
simulated the "and" operation by iterating over the smallest set and
checking the entries in the larger set using #contains --> it is 10-100
times slower than the "and" operation of RoaringBitmaps.
Now I really understand the hype around RoaringBitmaps. It seems absolutely
justified to me.
Smaller graphs with RoaringBitmaps need about twice as much memory for the
indexing structures (triples excluded) as GraphMem.
(The additional memory requirement is not only due to the bitmaps, but also
to the additional indexed set of triples).
For larger graphs (> 500k and above), this gap begins to close. At 1M
triples, the variant with roaring bitmaps wins the advantage with 88MB
compared to 106MB with GraphMem.
After loading all the triples from bsbm-25m.nt.gz and two JVM warmup
iterations, it only took about 18 seconds to add them to the new graph, and
this graph only required an additional 1941 MB of memory.
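
The index layout described above (an indexed set of all triples plus three maps from node to bitmap) can be sketched roughly as follows. This is not the code under discussion: java.util.BitSet stands in for RoaringBitmap so the example has no dependencies, nodes are plain strings, removal is left out, and the class name BitmapGraphSketch is made up.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Sketch of a bitmap-indexed triple store; BitSet stands in for RoaringBitmap.
public class BitmapGraphSketch {
    public static final class Triple {
        public final String s, p, o;
        public Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
        @Override public boolean equals(Object x) {
            if (!(x instanceof Triple)) return false;
            Triple t = (Triple) x;
            return s.equals(t.s) && p.equals(t.p) && o.equals(t.o);
        }
        @Override public int hashCode() { return Objects.hash(s, p, o); }
    }

    // The "indexed set": index -> triple and triple -> index.
    private final List<Triple> triples = new ArrayList<>();
    private final Map<Triple, Integer> indexOf = new HashMap<>();
    // One bitmap per node, holding the indices of the triples using that node.
    private final Map<String, BitSet> bySubject = new HashMap<>();
    private final Map<String, BitSet> byPredicate = new HashMap<>();
    private final Map<String, BitSet> byObject = new HashMap<>();

    public boolean add(String s, String p, String o) {
        Triple t = new Triple(s, p, o);
        if (indexOf.containsKey(t)) return false;
        int i = triples.size();
        triples.add(t);
        indexOf.put(t, i);
        bySubject.computeIfAbsent(s, k -> new BitSet()).set(i);
        byPredicate.computeIfAbsent(p, k -> new BitSet()).set(i);
        byObject.computeIfAbsent(o, k -> new BitSet()).set(i);
        return true;
    }

    // Pattern match; null is a wildcard.
    public List<Triple> find(String s, String p, String o) {
        if (s != null && p != null && o != null) {
            // SPO: look up the set of all triples directly.
            Triple t = new Triple(s, p, o);
            return indexOf.containsKey(t) ? Collections.singletonList(t) : Collections.emptyList();
        }
        BitSet hits = null;
        hits = and(hits, s, bySubject);
        hits = and(hits, p, byPredicate);
        hits = and(hits, o, byObject);
        if (hits == null) return new ArrayList<>(triples); // all wildcards
        List<Triple> out = new ArrayList<>();
        for (int i = hits.nextSetBit(0); i >= 0; i = hits.nextSetBit(i + 1)) {
            out.add(triples.get(i));
        }
        return out;
    }

    // Intersect the accumulated bitmap with the bitmap for one bound node.
    private static BitSet and(BitSet acc, String node, Map<String, BitSet> index) {
        if (node == null) return acc;            // wildcard: no constraint
        BitSet bits = index.get(node);
        if (bits == null) return new BitSet();   // node unknown: no matches
        if (acc == null) return (BitSet) bits.clone();
        acc.and(bits);
        return acc;
    }
}
```

For a pattern such as _PO, the two bitmaps are intersected with a single "and" and the surviving indices are mapped back to triples, which is the step reported above as particularly fast with RoaringBitmaps.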

I'm not sure how RoaringBitmaps handles permanent updates. I have tried
many #add and #remove calls on larger graphs and they seem to work well.
But there are two methods that caught my attention:
*
https://javadoc.io/doc/org.roaringbitmap/RoaringBitmap/latest/org/roaringbitmap/RoaringBitmap.html#runOptimize()
*
https://javadoc.io/doc/org.roaringbitmap/RoaringBitmap/latest/org/roaringbitmap/RoaringBitmap.html#trim()
I have no idea when it would be a good time to use them.
Removing and adding triples from a graph of size x in y iterations and
measuring the impact on memory and performance could be one way to find
potential problems.
Do you have a scenario in mind that I could use to test if I ever need one
of these methods?


Just from reading the javadoc - #runOptimize() might be useful for a 
load-and-readonly graph - do a lot of loading work and then switch to the 
more efficient form. It depends on how much space it saves. My instinct is 
that the saving for the overall graph may not be that great because the 
RDF terms take up a lot of the space at scale so savings on the 
bitmaps might, overall, not be significant.




Arne

Andy Seaborne  schrieb am Mo., 22. Mai 2023, 16:52:




On 20/05/2023 17:18, Arne Bernhardt wrote:

Hi Andy,
thank you, that was very helpful to get the whole picture.

Some time ago, I told you that at my workplace we implemented an in-memory
SPARQL-Server based on a Delta
<https://jena.apache.org/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/graph/compose/Delta.html>.
We started a few years ago, before RDF-patch
<https://jena.apache.org/documentation/rdf-patch/>, based on the "difference model"
<https://lists.w3.org/Archives/Public/www-rdf-interest/2001Mar/0216.html>,
that has become part of the CGMES standard.
For our server, we strictly follow the CQRS with event-sourcing
<https://learn.microsoft.com/en-us/azure/architecture/patterns/cqrs>
pattern. All transactions are recorded as an event with a list of triples
added and a list of triples removed.
The events are stored in an RDBMS (Oracle or PostgreSQL). For query
execution we need the relevant data to fit into memory but all data and
versions are also persisted.
To be able to store and load graphs very fast, we use RDF Thrift with LZ4
compression and store them in blobs.
All queries are executed on projected datasets for the requested version
(any previous version) of the data and the requested named graphs.
Thanks to the versioning, we fully support MR+SW. We even support multiple
writers, with a git-like branching and merging approach and o

Re: Fuseki Modules

2023-06-12 Thread Andy Seaborne




On 03/01/2023 13:43, Andy Seaborne wrote:



On 03/01/2023 11:40, LB wrote:

Hi all and late happy new year!

Nice work with the modules Andy.

Now, probably a silly question and maybe I missed something already 
mentioned in some other mail ...


Documentation:

"Modules are invoked during the process of building a Fuseki Main server."

I tried to add a Fuseki Module, but for the Fuseki with UI Standalone 
setup. It looks like my module does only work for setups based on 
Fuseki Main, is this correct? When using Fuseki Standalone, I can see 
from logs that FusekiModule::start is called,


How? Because FusekiModule is in jena-fuseki-main.

(Your fuseki module may have too much Fuseki in it - you have to exclude 
all of Fuseki from a module jar if it is shaded.)


Having to produce a jar file without the Fuseki in it, despite compiling 
code for Fuseki interfaces, is a bit of a burden when using FusekiMain 
in Java code.


FusekiModule conflates configuration during the build lifecycle and 
loading code using ServiceLoader.


The original idea was for drop-in extensions but the configuration during 
the build lifecycle is useful in its own right.


https://github.com/apache/jena/pull/1898

proposes some changes:

* FusekiModule becomes just the interface to server building.

* A new FusekiAutoModule combines FusekiModule with SubsystemLifecycle 
(the loading support)


* FusekiModules class - an immutable collection of FusekiModule (and 
FusekiAutoModule)


* FusekiServer.Builder can be given a FusekiModules object.
  If none is given, then one based on all the FusekiAutoModule is
  created. Also, for the system wide FusekiAutoModule, a fresh
  object is created for each server build so the object can
  hold per-build state.


The downside is that the ServiceLoader file in META-INF/services/
  org.apache.jena.fuseki.main.sys.FusekiModule
changes name to
  org.apache.jena.fuseki.main.sys.FusekiAutoModule

FusekiModule is still labelled "experimental".

The suggestion is getting the naming right long-term is more important 
than complete compatibility.


If anyone is using Fuseki module, please add your experiences so we can 
confidently remove the "experimental" tag.


Andy


Re: Why DatasetGraphInMemory?

2023-05-22 Thread Andy Seaborne
 have the previous versions until 
compaction happens. Each index is immutable after update and delta tree 
gets created (all the way back to the tree root). The tree roots are 
still in the DB until it is cleared up by compaction.


Sounds like you have the style, but applied to the graph, and can use 
the GC for clearing up.


---

Another is to use bitmap indexes: https://roaringbitmap.org/. (I don't 
know what the time/space tradeoff is for RDF usage.)


Andy



  Arne

Am Sa., 20. Mai 2023 um 15:19 Uhr schrieb Andy Seaborne :


Hi Arne,

On 19/05/2023 21:21, Arne Bernhardt wrote:

Hi,
in a recent response
<https://github.com/apache/jena/issues/1867#issuecomment-1546931793> to an
issue it was said that "Fuseki - uses DatasetGraphInMemory mostly".
For my PR <https://github.com/apache/jena/pull/1865>, I added a JMH
benchmark suite to the project. So it was easy for me to compare the
performance of GraphMem with
"DatasetGraphFactory.createTxnMem().getDefaultGraph()".
DatasetGraphInMemory is much slower in every discipline tested (#add,
#delete, #contains, #find, #stream).
Maybe my approach is too naive?
I understand very well that the underlying Dexx Collections Framework, with
its immutable persistent data structures, makes threading and transaction
handling easy


DatasetGraphInMemory (TIM = Transactions In Memory) has one big advantage.

It supports multiple-readers and a single-writer (MR+SW) at the same
time - truly concurrent. So does TDB2 (TDB1 is sort of hybrid).

MR+SW has a cost which is a copy-on-write overhead, a reader-centric
design choice allowing the readers to run latch-free.

You can't directly use a regular hash map with concurrent updates. (And
no, ConcurrentHashMap does not solve all problems, even for a single
data structure.) A dataset needs to coordinate changes to multiple
data structures into a single transactional unit.

GraphMem can not do MR+SW - for all storage datasets/graphs that do not
have built-in support for MR+SW, the best that can be done is MRSW -
multiple-readers or a single-writer.

For MRSW, when a writer starts, the system has to hold up subsequent
readers, let existing ones finish, then let the writer run, then release
any readers held up. (variations possible - whether readers or writers
get priority).

This is bad in a general concurrent environment. e.g. Fuseki.

One writer can "accidentally" lock-out the dataset.

Maybe the application isn't doing updates, in which case, a memory
dataset that focuses on read throughput is better, especially with better
triple density in memory.

Maybe the application is single threaded or can control threads itself
(non-Fuseki).


and that there are no issues with consuming iterators or
streams even after a read transaction has closed.


Continuing to use an iterator after the end of a transaction should not
be allowed.


Is it currently supported for consumers to use iterators and streams after
a transaction has been closed?


Consumers that want this must copy the iterator - it's an explicit opt-in.

Does this happen with Dexx? It may do, because Dexx relies on the
garbage collector so some things just happen.


If so, I don't currently see an easy way to
replace DatasetGraphInMemory with a faster implementation. (although
transaction-aware iterators that copy the remaining elements into lists
could be an option).


copy-iterators are going to be expensive in RAM - a denial of service
issue - and speed (lesser issue, possibly).


Are there other reasons why DatasetGraphInMemory is the preferred dataset
implementation for Fuseki?


MR+SW in an environment where there is no other information about
requirements is the safe choice.

If an app wants to trade the issues of MRSW for better performance, it
is a choice it needs to make. One case for Fuseki is publishing
relatively static data - e.g. reference data, changes from a known, well
behaved, application.

Both a general purpose TIM and a higher density, faster dataset have
their places.

  Andy



Cheers,
Arne







Re: Why DatasetGraphInMemory?

2023-05-20 Thread Andy Seaborne

Hi Arne,

On 19/05/2023 21:21, Arne Bernhardt wrote:

Hi,
in a recent  response
 to an
issue it was said that   "Fuseki - uses DatasetGraphInMemory mostly"  .
For my  PR , I added a JMH
benchmark suite to the project. So it was easy for me to compare the
performance of GraphMem with
"DatasetGraphFactory.createTxnMem().getDefaultGraph()".
DatasetGraphInMemory is much slower in every discipline tested (#add,
#delete, #contains, #find, #stream).
Maybe my approach is too naive?
I understand very well that the underlying Dexx Collections Framework, with
its immutable persistent data structures, makes threading and transaction
handling easy


DatasetGraphInMemory (TIM = Transactions In Memory) has one big advantage.

It supports multiple-readers and a single-writer (MR+SW) at the same 
time - truly concurrent. So does TDB2 (TDB1 is sort of hybrid).


MR+SW has a cost which is a copy-on-write overhead, a reader-centric 
design choice allowing the readers to run latch-free.
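
A very small sketch of the MR+SW idea, under the assumption that a naive full copy per write is acceptable for illustration: readers take an immutable snapshot without locking, while a single writer copies, modifies and publishes a new version. This is not how DatasetGraphInMemory is implemented (it uses persistent data structures rather than full copies); the class CopyOnWriteStore is hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Naive copy-on-write store: multiple readers AND one writer at the same time.
public class CopyOnWriteStore {
    // The current immutable version; readers only ever see complete versions.
    private final AtomicReference<Set<String>> current =
            new AtomicReference<>(Collections.<String>emptySet());
    // Serialises writers so there is at most one at a time.
    private final ReentrantLock writerLock = new ReentrantLock();

    // Readers: no lock, no waiting. The returned snapshot never changes.
    public Set<String> snapshot() {
        return current.get();
    }

    // Writer: copy the current version, change the copy, publish atomically.
    public void add(String item) {
        writerLock.lock();
        try {
            Set<String> copy = new HashSet<>(current.get());
            copy.add(item);
            current.set(Collections.unmodifiableSet(copy));
        } finally {
            writerLock.unlock();
        }
    }

    public static void main(String[] args) {
        CopyOnWriteStore store = new CopyOnWriteStore();
        Set<String> before = store.snapshot();
        store.add("triple-1");
        System.out.println("old snapshot size: " + before.size());
        System.out.println("new snapshot size: " + store.snapshot().size());
    }
}
```

The copy made on every write is the overhead referred to above; it buys readers that never wait for the writer.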


You can't directly use a regular hash map with concurrent updates. (And 
no, ConcurrentHashMap does not solve all problems, even for a single 
data structure.) A dataset needs to coordinate changes to multiple 
data structures into a single transactional unit.


GraphMem can not do MR+SW - for all storage datasets/graphs that do not 
have built-in support for MR+SW, the best that can be done is MRSW - 
multiple-readers or a single-writer.


For MRSW, when a writer starts, the system has to hold up subsequent 
readers, let existing ones finish, then let the writer run, then release 
any readers held up. (variations possible - whether readers or writers 
get priority).
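
For comparison, MRSW can be sketched with a standard read/write lock wrapped around an ordinary mutable set. The class MrswStore is hypothetical and only shows the locking pattern, not Jena's transaction code.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Multiple readers OR a single writer, using a read/write lock.
public class MrswStore {
    private final Set<String> data = new HashSet<>();
    // Passing "true" to the constructor would select the fair policy,
    // one of the reader/writer priority variations.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Many readers may hold the read lock together, but they are held up
    // while a writer holds the write lock.
    public boolean contains(String item) {
        lock.readLock().lock();
        try {
            return data.contains(item);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The writer waits for existing readers to finish, then runs alone.
    public void add(String item) {
        lock.writeLock().lock();
        try {
            data.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        MrswStore store = new MrswStore();
        store.add("triple-1");
        System.out.println("contains triple-1: " + store.contains("triple-1"));
    }
}
```

A long-running write holds the write lock for its whole duration, which is how one writer can lock readers out of the dataset.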


This is bad in a general concurrent environment. e.g. Fuseki.

One writer can "accidentally" lock-out the dataset.

Maybe the application isn't doing updates, in which case, a memory 
dataset that focuses on read throughput is better, especially with better 
triple density in memory.


Maybe the application is single threaded or can control threads itself 
(non-Fuseki).



and that there are no issues with consuming iterators or
streams even after a read transaction has closed.


Continuing to use an iterator after the end of a transaction should not 
be allowed.



Is it currently supported for consumers to use iterators and streams after
a transaction has been closed?


Consumers that want this must copy the iterator - it's an explicit opt-in.

Does this happen with Dexx? It may do, because Dexx relies on the 
garbage collector so some things just happen.



If so, I don't currently see an easy way to
replace DatasetGraphInMemory with a faster implementation. (although
transaction-aware iterators that copy the remaining elements into lists
could be an option).


Copy-iterators are going to be expensive in RAM - a denial of service 
issue - and in speed (lesser issue, possibly).
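
A minimal sketch of what "copy the iterator" means, using only the JDK. The helper IteratorCopy.copy is hypothetical. The caller drains whatever is left of the iterator into a list while the transaction is still open, and works from the list afterwards.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper; not Jena code. */
final class IteratorCopy {
    /** Drain the remaining elements into a list. Call inside the transaction. */
    public static <T> List<T> copy(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        it.forEachRemaining(out::add);
        return out;
    }
}
```

The list holds every remaining element at once, so a large result set costs memory in proportion to its size. That is the RAM concern above.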



Are there other reasons why DatasetGraphInMemory is the preferred dataset
implementation for Fuseki?


MR+SW in an environment where there is no other information about 
requirements is the safe choice.


If an app wants to trade the issues of MRSW for better performance, it 
is a choice it needs to make. One case for Fuseki is publishing 
relatively static data - e.g. reference data, with changes from a known, 
well-behaved application.


Both a general purpose TIM and a higher density, faster dataset have 
their places.


Andy



Cheers,
Arne



Re: Bumps in the road(map)

2023-05-14 Thread Andy Seaborne




On 23/04/2023 15:16, Andy Seaborne wrote:


4/ Others?
Drop the war file?


https://github.com/apache/jena/issues/1867 reminds me ...

Switch to term equality on all graphs.
This affects GraphMem (keep it around but don't use it by default).

Having value-based indexing in only one place can be confusing.
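
A minimal sketch of the difference, assuming literals are given as a lexical form plus a datatype IRI. The class LiteralEquality and its methods are hypothetical, and the value comparison only handles xsd:integer. "1"^^xsd:integer and "01"^^xsd:integer are different terms that denote the same value.

```java
import java.math.BigInteger;

/** Hypothetical illustration; not Jena code. */
final class LiteralEquality {
    static final String XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer";

    /** Term equality: same datatype and the same lexical form, character for character. */
    public static boolean sameTerm(String lex1, String dt1, String lex2, String dt2) {
        return dt1.equals(dt2) && lex1.equals(lex2);
    }

    /** Value equality, simplified to xsd:integer: compare the numbers denoted. */
    public static boolean sameIntegerValue(String lex1, String lex2) {
        return new BigInteger(lex1).equals(new BigInteger(lex2));
    }
}
```

With term equality, a lookup for "1"^^xsd:integer does not find a triple stored with "01"^^xsd:integer. With value-based matching it does.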

Andy


Bumps in the road(map)

2023-04-23 Thread Andy Seaborne

There are two significant changes:


1/ Java21 is due to be released in September 2023 and will be an LTS release.

Given our policy of "2 versions of Java" interpreted as "2 LTS 
releases", we can move to requiring Java17.


(Java17, with compiler set to "release=11", outputting Java11 byte code, 
is already used to build Jena. This is because javadoc generation with 
native Java11 has been broken in several ways.)


Java17 has multiline strings for SPARQL queries!
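
A small sketch of a SPARQL query written as a Java text block (text blocks have been a standard language feature since Java 15, so they are available in Java 17). The class name QueryText and the query are only examples.

```java
/** Example only: a SPARQL query held in a Java text block. */
final class QueryText {
    static final String QUERY = """
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            SELECT ?s ?label
            WHERE {
              ?s rdfs:label ?label .
            }
            LIMIT 10
            """;
}
```

The common leading indentation is stripped by the compiler, so the query keeps its own layout without string concatenation or "\n" escapes.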


2/ javax.* -> jakarta.*

This is the difference between Jetty 10 and Jetty11. Jetty 12.0 is 
currently in beta.


But.

Spring Boot 2 is based on javax (Jetty10) and Spring Boot 3 uses jakarta 
(Jetty11 configured).


Spring Boot 2 to Spring Boot 3 includes other upgrades as well. [1]

A way to deal with this is switch to jakarta.* at Jena 5.


This gives us:

April   - Jena 4.8.0
July(-ish)  - Jena 4.9.0

October 5.0.0: Java17, Jetty11, maybe Jetty12.
and leave a Jena4 branch.

So if we are doing Jena 5, what else should change at the major version 
bump?



3/ Drop a separate JSON-LD 1.0 subsystem.

This also pulls in org.apache.http (although Jena controls the versions 
because we've had to in the past to get maven to make the right choice 
in resolving alternatives).


The last commit to jsonld-java/main was Dec 13, 2021
The front page says : "JSONLD-Java is looking for a maintainer"

JSON-LD 1.1 was published 16 July 2020

4/ Others?
Drop the war file?

Andy


[1] 
https://github.com/spring-projects/spring-boot/wiki/Spring-Boot-3.0-Migration-Guide


[RESULT] [VOTE] Apache Jena 4.8.0 RC2

2023-04-23 Thread Andy Seaborne

The VOTE passes with 3 +1 from Bruno, Aaron and Andy

I'll start the next steps.

Andy



Re: [VOTE] Apache Jena 4.8.0 RC2

2023-04-20 Thread Andy Seaborne

+1



[VOTE] Apache Jena 4.8.0 RC2

2023-04-20 Thread Andy Seaborne

Hi,

Here is a vote on the release of Apache Jena 4.8.0.
This is the second release candidate.

The deadline is

Sunday, 23rd April 2023 at 12:00 UTC

Please vote to approve this release:

[ ] +1 Approve the release
[ ]  0 Don't care
[ ] -1 Don't release, because ...

 Items in this release

== RC 2

a/ Fix the initialization issue found in RC1
https://github.com/apache/jena/pull/1847

b/ GH-1749: Replacing webpack chunks by Vite rollup

c/ fix and test for JENA-2352

== RC 1

https://s.apache.org/jena-4.8.0-issues

* The RDF/XML parser has been converted to use the
  Jena IRI abstraction IRIx.
  https://github.com/apache/jena/issues/1773

This is the first part of a move to convert the RDF/XML parser to be 
consistent with the rest of Jena parsing:


1. unified IRI treatment of error handling and reporting throughout Jena
2. improve maintainability
3. allow for alternative providers of IRI functionality

* Add CHANGES.txt
https://github.com/apache/jena/blob/main/CHANGES.txt
  It has been backfilled with announcement messages from 4.0.0 onwards.
  It will be updated after the release - it has a link to [ANN]

* Search facility on the Jena website

@lucasvr (Lucas C. Villa Real) provided an analysis and improvement to 
bulk loading operations.

  https://github.com/apache/jena/issues/1803
  https://github.com/apache/jena/pull/1819

@wjl110 - Shiro upgrade PR#1728
  https://github.com/apache/jena/pull/1728

Lucene upgrade from 9.4.2 to 9.5.0
  https://github.com/apache/jena/pull/1740
  https://lists.apache.org/thread/696xgpyg2441kzdowmp1b40tshctw25c

@dplagge (Daniel Plagge) - Delta graph fix
https://github.com/apache/jena/issue/1751

SimonBin: Fix for sharing link in Fuseki and YASGE
  https://github.com/apache/jena/issues/1745

Improved performance of "GRAPH ?g {}" (all graph names)
Prefix scan -- GRAPH ?G
  https://github.com/apache/jena/issues/1639
  https://github.com/apache/jena/pull/1655

@nichtich (Jakob Voß) jena-site improvements:
  https://github.com/apache/jena-site/pull/151

@sverholen JENA-2350 Pass JsonLdOptions to titanium for json-ld 1.1

 Release Vote

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release.

Staging repository:
  https://repository.apache.org/content/repositories/orgapachejena-1058

Proposed dist/ area:
  https://dist.apache.org/repos/dist/dev/jena/

Keys:
  https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
  https://github.com/apache/jena/commit/198e6950c7

Git Commit Hash:
  198e6950c7652ffe68c9171bc5ed92c69210c60a

Git Commit Tag:
  jena-4.8.0

This vote will be open until at least

  Sunday, 23rd April 2023 at 12:00 UTC

If you expect to check the release but the time limit does not work
for you, please email within the schedule above.

Thanks,
Andy

Checking:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?
+ can the source archive be built?
  (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
  (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
   if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?
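
For the checksum item, a minimal sketch of checking a SHA-512 digest with the JDK alone. It assumes the artifact bytes have already been read into memory and the expected hex string has been taken from the accompanying checksum file. The class Sha512Check is hypothetical and is not part of the release tooling.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for checking a SHA-512 checksum; not release tooling. */
final class Sha512Check {
    /** Lower-case hex SHA-512 digest of the given bytes. */
    public static String sha512Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Compare against the expected hex string, ignoring case and surrounding whitespace. */
    public static boolean matches(byte[] data, String expectedHex) {
        return sha512Hex(data).equalsIgnoreCase(expectedHex.trim());
    }
}
```

For a real artifact the bytes would come from something like java.nio.file.Files.readAllBytes on the downloaded file. This checks integrity only; the GPG signature check is separate.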

