Logging update.
Using JUL (java.util.logging) has a problem - it needs initialization
code or a system property (java.util.logging.config.file) to be set.
What it does not do is read
"logging.properties" from the classpath by default, like log4j (1 or 2)
does.
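
For illustration, the kind of initialization code JUL needs might look
like the sketch below. It is not the jena-base setup code; the class
name JulSetup and the fallback configuration string are made up for the
example. Only LogManager.readConfiguration(InputStream) is the real JUL
call doing the work.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.logging.LogManager;
import java.util.logging.Logger;

// Hypothetical helper (not part of Jena): explicit JUL initialization.
public class JulSetup {

    /** Load JUL configuration from a classpath resource.
     *  Returns false if the resource is not on the classpath. */
    public static boolean initFromClasspath(String resourceName) throws IOException {
        try (InputStream in = JulSetup.class.getClassLoader().getResourceAsStream(resourceName)) {
            if (in == null)
                return false;
            LogManager.getLogManager().readConfiguration(in);
            return true;
        }
    }

    /** Load JUL configuration from a string in logging.properties syntax. */
    public static void initFromString(String config) throws IOException {
        InputStream in = new ByteArrayInputStream(config.getBytes(StandardCharsets.UTF_8));
        LogManager.getLogManager().readConfiguration(in);
    }

    public static void main(String[] args) throws IOException {
        // Try the classpath first, then fall back to built-in defaults
        // (the fallback settings here are an assumption for the example).
        if (!initFromClasspath("logging.properties")) {
            initFromString(".level=INFO\nhandlers=java.util.logging.ConsoleHandler\n");
        }
        Logger.getLogger("example").info("JUL initialized");
    }
}
```

The point is that something has to call this before the first log
message, which is exactly the step log4j does on its own.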
At the moment, I have converted the majority of the modules to use
log4j2 in testing.
One gotcha - if the logging configuration isn't found the effect is to
simply skip the tests without any warning or error. This is confusing!
Current plan:
* Convert to log4j2 everywhere.
Every module is set up for testing logging using log4j2 - each with its
own log4j2 configuration (they are nearly all the same).
* The configuration is log4j2.xml or log4j2.properties.
I like the YAML format but that needs additional dependencies from
Jackson: databind and dataformat-yaml. databind has been the cause of a
number of CVEs. databind - no.
* Other clearups
Since logging changes are going into most of the POMs, now is a good
time to do other clearups. Anything on anyone's mind?
One such is removing the last of the dependencies from /pom.xml (while
keeping dependency management). The org.slf4j dependency goes into
jena-base (JENA-1860).
The only advantage of parent dependencies is that it works for
<scope>test. Dependencies inherited from a parent are fixed: while
<scope>test works for a parent-declared dependency, a module can't
exclude it.
junit and logging for tests go into each module POM or into the
subparents (elephas, jdbc, jena-db, fuseki2).
Is there a better way to do this? The one I can think of is a new,
optional parent for modules, holding a few declarations (1 for junit,
2 for logging), but that is a new maven artifact for 3 declarations,
which seems rather heavy.
Andy
On 11/02/2020 13:29, Andy Seaborne wrote:
Summary:
* Replacing log4j1 in logging for Fuseki will affect some users.
log4j1 is well past its end of life. There is a CVE on log4j1 - no
impact directly on Jena but a nudge to advance on JENA-1005.
The CVE is for using the socket server for receiving log events from other
machines. Jena does not use it, and it would be the user who configured
any use of it. As everything in log4j1 is in the single jar, shipping
that jar ships the socket server and it shows up in static security scans.
So - what does it take to get off log4j1 and what to switch to?
Uses of logging output:
* Fuseki has logging setup code with internal defaults
* jena-cmds uses logging with internal defaults
* Testing - Jena needs some logging provider to run during tests. Also,
some testing wants to dynamically adjust logging (e.g. suppress warnings
when expected)
* jena-jdbc ??
PR#690 makes the testing independent of the logging provider: the tests
use the jena-base logging setup code to adjust logging dynamically,
whatever the provider. It does not change the use of log4j1.
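
As a rough sketch of what "dynamically adjusting logging" means in a
test: raise a logger's level around code that is expected to warn, then
restore it. The example uses JUL only so that it is self-contained; the
class name LogLevelScope is invented here and is not the jena-base
helper, whose API is not shown in this message.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper (not the jena-base code): temporarily change a
// logger's level and restore the previous level on close().
public class LogLevelScope implements AutoCloseable {
    private final Logger logger;   // strong reference so the logger is not collected
    private final Level previous;  // may be null, meaning "inherit from parent"

    public LogLevelScope(String loggerName, Level temporary) {
        this.logger = Logger.getLogger(loggerName);
        this.previous = logger.getLevel();
        logger.setLevel(temporary);
    }

    @Override
    public void close() {
        logger.setLevel(previous);
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("example.parser");
        try (LogLevelScope scope = new LogLevelScope("example.parser", Level.SEVERE)) {
            log.warning("expected warning, suppressed inside the scope");
        }
        log.warning("warnings are visible again");
    }
}
```

A provider-independent version would hide the setLevel calls behind one
interface with an implementation per logging provider.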
The tests use whatever is set up in /pom.xml as a <scope>test dependency.
That's not ideal because it makes it hard to have different logging
choices for different usages in Jena but it does avoid duplicating a
<dependency> in each module. (Is there a better way?).
The crunch is Fuseki because a change of logging is a change of existing
logging configurations. It also makes the choice for tests the same as
Fuseki because you can't undepend on something from the parent as far as
I can see, and if Fuseki adds logging to its <scope>runtime, there are
two logging providers and slf4j issues warnings.
The simple solution is to use log4j2 in the test suites. The alternative
is to put a <dependency> in each module, removing it from /pom.xml; then
the tests can be JUL (and cmds?), and Fuseki server logging log4j2.
log4j2 has a properties-like configuration format (as well as
XML, YAML, and JSON). For installations that use the default logging
setup (formatted, to stdout), there will be no change.
My guess is that only a few installations do change the default setup but
I know of installations that do. That includes one where the logs are
sent to Linux syslog and then collected across a cluster. JUL
(java.util.logging) doesn't have many connectors; log4j does.
While the command line tools do look for a local log4j.properties, I
don't know of any usage. The limited logging (e.g. tdb loaders) is used
more for a convenient way to control formatting and switch on/off. The
parsers do log which helps in library and command line use.
FYI: there is a slf4j 1.8 that switches to using ServiceLoader. It is at
1.8.0-beta4. There is also a slf4j 2.0.0, currently 2.0.0-alpha1 that
adds a fluent API for log messages. From what I read, existing normal
logging code is binary compatible.
So Fuseki has to change, and some (not all) users are affected.
Fuseki (servers, not WAR?) choice is log4j2 or another full-featured
system.
Ideally, jena-cmds is JUL, just for the footprint reduction, but can be
log4j2.
Andy