Logging update.

Using JUL (java.util.logging) has a problem: it needs initialization code or a system property to be set. Unlike log4j (1 or 2), it does not read a configuration file such as "logging.properties" from the classpath by default.
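
To illustrate the kind of initialization code JUL needs, here is a minimal sketch that loads a configuration from the classpath with LogManager.readConfiguration. The class name JulInit, its method names, and the fallback settings are made up for illustration and are not Jena code. The system property alternative is java.util.logging.config.file, which takes a file path rather than a classpath resource.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.LogManager;
import java.util.logging.Logger;

// Hypothetical helper (not part of Jena): explicit JUL setup from the classpath.
final class JulInit {
    private JulInit() {}

    /** Configure JUL from a classpath resource; returns false if the resource is not found. */
    static boolean initFromClasspath(String resourceName) {
        try (InputStream in = JulInit.class.getClassLoader().getResourceAsStream(resourceName)) {
            if (in == null)
                return false; // JUL itself never looks here, so the caller must decide on a fallback
            LogManager.getLogManager().readConfiguration(in);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Configure JUL from properties text, e.g. built-in defaults. */
    static void initFromString(String properties) {
        // java.util.Properties reads streams as ISO-8859-1
        byte[] bytes = properties.getBytes(StandardCharsets.ISO_8859_1);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            LogManager.getLogManager().readConfiguration(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Try the classpath first, then fall back to illustrative defaults.
        if (!initFromClasspath("logging.properties"))
            initFromString(".level=INFO\nhandlers=java.util.logging.ConsoleHandler\n");
        Logger.getLogger(JulInit.class.getName()).info("JUL configured");
    }
}
```

Returning false for a missing resource, rather than silently doing nothing, is a deliberate choice here so that the caller can report the missing configuration.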

At the moment, I have converted the majority of the modules to use log4j2 in testing.

One gotcha - if the logging configuration isn't found the effect is to simply skip the tests without any warning or error. This is confusing!

Current plan:

* Convert to log4j2 everywhere.

Every module is set up for testing logging using log4j2, each with its own log4j2 configuration (they are nearly all the same).

* The configuration is log4j2.xml or log4j2.properties.

I like the YAML format but that needs additional dependencies from Jackson: jackson-databind and jackson-dataformat-yaml. databind has been the cause of the CVEs. databind - no.

* Other clearups

Since logging changes are going into most of the POMs, now is a good time to do other clearups. Anything on anyone's mind?

One such is removing the last of the dependencies from /pom.xml (while keeping dependency management). The org.slf4j dependency goes into jena-base (JENA-1860).

The only advantage of parent dependencies is that they work for <scope>test. Dependencies inherited from a parent are fixed: while <scope>test works for a parent-declared dependency, a module can't exclude it.

junit and logging for tests go into each module POM or into the subparents (elephas, jdbc, jena-db, fuseki2).

Is there a better way to do this? The one I can think of is a new, optional parent for modules, holding a few declarations (1 for junit, 2 for logging), but that is a new maven artifact for 3 declarations, which seems rather heavy.

    Andy

On 11/02/2020 13:29, Andy Seaborne wrote:
Summary:

* Replacing log4j1 in logging for Fuseki will affect some users.



log4j1 is well past its end of life. There is a CVE on log4j1: no direct impact on Jena, but a nudge to advance on JENA-1005.

The CVE is for using the socket server for receiving log events from other machines. Jena does not use it, and it would be the user who configured any use of it. As everything in log4j1 is in the single jar, shipping that jar ships the socket server and it shows up in static security scans.

So - what does it take to get off log4j1 and what to switch to?

Uses of logging output:

* Fuseki has logging setup code with internal defaults
* jena-cmds uses logging with internal defaults
* Testing - Jena needs some logging provider to run during tests. Also, some testing wants to dynamically adjust logging (e.g. suppress warnings when expected)
* jena-jdbc ??

PR#690 makes the testing independent of logging provider by using the jena-base logging setup code to make dynamically adjusting logging independent of the logging provider. It does not change the use of log4j1.
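
As an illustration of what "dynamically adjusting logging" means for a test, here is a minimal sketch written directly against JUL. It is not the jena-base setup code from the PR; the class LogLevelScope and the method withLevel are hypothetical names. It raises a logger's level for the duration of an action and restores the previous setting afterwards.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper (not the jena-base API): temporarily change a logger's level.
final class LogLevelScope {
    private LogLevelScope() {}

    /** Run the action with the named logger set to the temporary level, then restore it. */
    static void withLevel(String loggerName, Level temporary, Runnable action) {
        // Hold a strong reference: JUL's LogManager only keeps weak references to loggers.
        Logger logger = Logger.getLogger(loggerName);
        Level previous = logger.getLevel(); // null means "inherit from the parent logger"
        logger.setLevel(temporary);
        try {
            action.run();
        } finally {
            logger.setLevel(previous); // restores null (inherit) as well as explicit levels
        }
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("example.parser");
        // The warning is expected in this "test", so it is suppressed for the duration.
        withLevel("example.parser", Level.SEVERE, () -> log.warning("expected warning"));
        log.warning("warnings are reported again");
    }
}
```

A provider-independent version would hide the JUL calls behind an interface with one implementation per logging provider; that layering is assumed here rather than shown.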

The tests use whatever is set up in /pom.xml as a <scope>test dependency. That's not ideal because it makes it hard to have different logging choices for different usages in Jena, but it does avoid duplicating a <dependency> in each module. (Is there a better way?)

The crunch is Fuseki because a change of logging is a change of existing logging configurations.  It also makes the choice for tests the same as for Fuseki, because you can't undepend on something from the parent as far as I can see, and if Fuseki adds logging to its <scope>runtime, there are two logging providers and slf4j issues warnings.

The simple solution is to use log4j2 in the test suites. The alternative is to put a <dependency> in each module, removing it from /pom.xml; then the tests (and cmds?) can use JUL, and Fuseki server logging can use log4j2.

log4j2 has a properties-like configuration format (as well as XML, YAML, and JSON).  For installations that use the default logging setup (formatted, to stdout), there will be no change.

My guess is that only a few installations change the default setup, but I know of installations that do. That includes one where the logs are sent to Linux syslog and then collected across a cluster. JUL (java.util.logging) doesn't have many connectors; log4j does.

While the command line tools do look for a local log4j.properties, I don't know of any usage.  The limited logging (e.g. tdb loaders) is used more for a convenient way to control formatting and switch on/off.  The parsers do log which helps in library and command line use.

FYI: there is a slf4j 1.8 that switches to using ServiceLoader. It is at 1.8.0-beta4. There is also a slf4j 2.0.0, currently 2.0.0-alpha1 that adds a fluent API for log messages.  From what I read, existing normal logging code is binary compatible.

So Fuseki has to change, some (not all) users affected.

Fuseki (servers, not WAR?) choice is log4j2 or other full-featured system.

Ideally, jena-cmds is JUL, just for the footprint reduction, but it can be log4j2.

     Andy
