Hello guys,

This thread has been going on for some time, and a good amount of people have spoken their opinions on it, and nobody has been explicitly against the upgrade.

Some concerns have been raised during the thread:

- Testing: we have already tested this change against multiple scenarios that we have in hand, such as VMware, KVM, Veeam + VMware, and all related ACS feature set that is commonly used by all (deploy VM, update VM, migrate, create volumes, and so on), and we welcome anyone who is interested in further testing. Unfortunately, only a handful of people have control over the CI that is currently used by the community; therefore, we are always waiting for someone to run it. Furthermore, it is a black box, we also need people with access to provide us more details when a failure happens.

- Conflicts: our intention is to merge this ASAP so that the contributors have plenty of time to fix their PRs, which honestly should not be that hard, as the problem will only happen with the "logger" variable. We also made a simple script available to help people to address conflicts regarding the logger variable.

- Upgrading: The PR already upgrades the current default log4j configurations, but users will have to be warned that they should port their customizations to the log4j2 format. We can add a notice for that in the release notes of the release.

All that being said, I will create a vote thread with the discussion summarized and with the proposals.

Best regards,
João Jandre (JoaoJandre)

On 13/05/2023 13:22, Rohit Yadav wrote:
All,

The latest news I heard from the reload4j project is that they are now 
sponsored by the Sovereign Tech Fund (STF) among other backers [1] so I don't 
see any support, maintenance, longevity risk for the CloudStack project to 
continue using reload4j:
https://twitter.com/qos_ch/status/1657069740341198881


Considering all the facts, I don't think there is any urgency or technical 
requirement for us to switch our current logging library.


However, I also don't object to the proposed effort to upgrade CloudStack to 
log4j 2.x as long as;

   *   all risks are have been considered and all efforts are made to reduce 
the impact (for example, help merge as many outstanding PRs as possible before 
merging this, possibly merge the PR early in the release cycle once consensus 
is established)
   *   a plan is shared with the community as to when and how we migrate to 
log4j 2.x, and any manual upgrade/migration steps are documented; a notice may 
be released on the project blog or in the next release notes
   *   it meets the community guidelines of passing reviews, CI/build tests 
(esp smoketests across all supported distros and hypervisors), and wrt release 
perhaps also the upgrade tests

The PR's current approach of replacing reload4j with log4j 2.x does not reduce 
the impact on the community in the future if there is another breaking library 
change. I like Daan's suggestion of having a log-agnostic logging utility if 
it's possible to accommodate.


Regards.

________________________________
From: Daniel Salvador <dvsalvador...@gmail.com>
Sent: Thursday, May 11, 2023 05:43
To: us...@cloudstack.apache.org <us...@cloudstack.apache.org>
Cc: dev@cloudstack.apache.org <dev@cloudstack.apache.org>
Subject: Re: ACS upgrade to Log4J2 version 2.19

Daan,

IMHO, proxying libraries would bring us more disadvantages than benefits.
If we define that the standard is to proxy libraries such as Utils, Log4J,
and others, we would have to proxy all of them (even Spring); it would
require a huge effort (way more than this work with Log4j) to introduce and
maintain those proxies. Also, ACS already is a complex system to work;
proxying libraries would add one more layer of complexity to the platform
(and one more point of failure); thus, making it difficult for new
contributors to join the community.

The community already put a lot of effort into the Log4J2 upgrade. The PR
being discussed in this thread is ready for merge (besides the conflicts
that the author is resolving as soon as they appear), it has been tested
(it could benefit from more testing though) with a variety of setups and
use cases, and the impact on conflicts is minimal (giving its size and
nature); therefore, along with what I pointed out, I do not think proxying
libraries would be a good approach.

Furthermore, to conclude. If the community seems to count on support of
both users and other devs, and we have already done quite an extensive test
round, what else are we missing to move along with that PR? Is any of the
merging criteria missing to move on with it?

Best regards,
Daniel Salvador (gutoveronezi)

On Tue, May 9, 2023 at 1:52 PM Daan Hoogland <daan.hoogl...@gmail.com>
wrote:

Rohit, Daniel,

Let me weigh in a bit. I held back about this as I have discussed this in
the past a lot in different settings and do not want to take sides. My
personal motivation for either reload4j or log4j2 are slim but I can see
that people might want to lean to either. For this reason I want to bring
up another pet preference of mine:

I think we should proxy each and every external library we use with our own
packages. This means not even using something like slf4j but implement our
own framework and let people choose either on package or on deploytime what
to use, with a reasonable default of course.
At the moment I would lean towards log4j2 as the default but I do not want
my clients to have to use that as I know some of them will blame me for
extra night shifts if we do.

This pet preference is valid for more than logging but is very applicable
on the subject.

This will require some extra effort, so feel free to push back but I think
it will be worth it.

that all said the standardisation of the calling side of things is going to
be a pain but should be done as well. (imnsho)

regards,


On Sat, May 6, 2023 at 3:31 AM Daniel Salvador <gutoveron...@apache.org>
wrote:

Thanks, Rohit for the reply

Not only the author, but me and other colleagues from the community can
build DEB and RPM packages (commands [1] and [2]). We already did, and
the
process to build RPM and DEB works just fine.

Also, basic CloudStack installation, setup and zone deployment, VM and
volume lifecycles with KVM and VMware, and so on, have already been
tested
and that was reported on the PR and this thread. The problem here is that
blueorangutan is failing and it is a black box; therefore, the author
does
not know what is happening inside it.

The conflicts that might happen in other PRs is mostly renaming the
loggers, which the fixes can be easily automated via scripts. For the
efforts on the release management, I am pretty sure some colleagues from
the community would be glad to help as well.

We understand the process and everybody seems to agree on this step
forward; is there any technical reason not to? Anyone else have thoughts
on
this?

Best regards,
Daniel Salvador (gutoveronezi)

[1] docker run -v /tmp:/mnt/build -v ~/.m2:/root/.m2 -e "USER_ID=$(id
-u)"
-e "USER_GID=$(id -g)" -e "ACS_BUILD_OPTS=-Dnoredist"
scclouds/cloudstack-deb-builder:ubuntu2004-jdk11-python3 --git-remote
https://github.com/apache/cloudstack.git --git-ref refs/pull/7131/head
[2] docker run -v /tmp:/mnt/build -v ~/.m2:/root/.m2 -e "USER_ID=$(id
-u)"
-e "USER_GID=$(id -g)"  scclouds/cloudstack-rpm-builder:centos7-jdk11
--distribution centos7 --pack noredist --git-remote
https://github.com/apache/cloudstack.git --git-ref refs/pull/7131/head


On Thu, May 4, 2023 at 3:45 AM Rohit Yadav <rohit.ya...@shapeblue.com>
wrote:

Daniel,

On your remarks;

   *   The currently logging process indeed allows us to make changes to
the log config xml which is read by the framework to affect what is
logged,
without restarting say the management server, kvm agent or the
ssvm/cpvm
agents. Few of my colleagues have confirmed this works already. I'm not
sure about any other features (or lack of them).

   *   You're right about Reload4j's project statement to not add major
changes, however, they state they'll do performance improvements,
enhancements, and (security) fixes. I think this meets the requirements
of
having a stable, reliable logging library. But you're right about the
general innovation argument.

That said if we've features we require, then I would look at (a) can
there
be a central library framework/utility allowing users to choose what
library they want to use, or (b) let's just agree and move to the
proposed
log4j2.x if there are clear advantages.

 From what I've learned so far I don't see such clear advantages for the
use cases I'm thinking or I've considered. I would be wrong by a mile,
but
happy to hear what others think.

   *   On testing the PR, those of us with access can certainly help as
we
help on all community PRs to review and help test. Pl go ahead and ask
any
specifics, I'm sure we'd be happy to assist. I see a few contributors
with
access are already helping.

Another option the authors have is to try and build the rpm and deb
packages themselves (we've documentation I think, and container images
that
BO/Trilian uses here https://hub.docker.com/u/shapeblue). With such
packages, the authors can at least test basic CloudStack installation,
setup and zone deployment, VM, volume lifecycles. They may also try to
deploy such env if packages are available and run smoketests using mbx
which is a lite-version of BO/Trillian CI system, or that may be done
even
manually (say with KVM in a VM?).

   *   We shouldn't favour a particular pull request over others, or for
that matter one library vs another; however, as with anything that
impacts
the community, potentially cause overhead/work, the community must try
to
review the facts, pros, and cons, see how to reduce the impact, build
support in the community, define and agree on a plan and then execute
with
the community. The right path can be hard and long.

We've in fact waited long time (sometimes a year or more) on many such
large-impact initiatives, for example getting the ant-design/vue-based
UI
to deprecate the legacy UI probably took about two years, off recently
for
the vue3 upgrade PR we had to wait for 2-3 major releases, the tungsten
support PR is another one I can think off, and initiatives such as
getting
the go-sdk and terraform plugin donated in the community took long and
painful months to get them done. In all these cases, the issue of
having
daily conflicts or building consensus, or having to wait long term was
unavoidable and painful to stakeholders, but in hindsight they were the
right approaches compliant to ASF, the community did the right thing
for
themselves.


Regards.

________________________________
From: Daniel Salvador <gutoveron...@apache.org>
Sent: Thursday, May 4, 2023 02:28
To: dev@cloudstack.apache.org <dev@cloudstack.apache.org>
Cc: us...@cloudstack.apache.org <us...@cloudstack.apache.org>
Subject: Re: ACS upgrade to Log4J2 version 2.19

Hi all,

Nux, thanks for the advice on communication, it makes perfect sense.

Regarding Rohit's points, these are my thoughts about them:

- Reload4J is a project with the goal of fixing pressing security
issues
for 1.2.17 and its community does not seem to have the goal of adding
new
features, as its own documentation claims (it is not a demerit of the
project, it is just a characteristic of it). This holds us to a limited
technology and does not allow us to make progress in this area of the
project. We are 10 years behind logging technologies and the more we
wait,
the harder it will be to advance.

- All the features João has mentioned will facilitate the
troubleshooting
and the coding process. Based on my experience, the most important
points
of them in our current context are (i) the automatic reload of
configurations, which allows us to change the configurations without
having
to restart the service and losing the context for the troubleshooting,
and
(ii) the improvements on the code writing that comes with its features
(lambda support, message support, and others).

- With respect to the side effects of the patch, indeed, it is expected
to
have conflicts on the pull requests. However, most of them (if not all)
will be related to the logging package import and logger renaming, as
Rohit
have pointed out; thus, the major work would be renaming the loggers to
the
standardized name. Perhaps João could share with us the script he used
to
standardize the loggers names, as he already did with the other scripts
he
used (which are in the PR description).

- Almost daily, CloudStack receives pull requests; this way, if we wait
for
the PRs to be merged first, we would join in an endless loop and would
not
be able to progress with this area. In this scenario, I think the best
strategy would be to merge this patch first and then notify everybody
about
the merging along with the script to fix the conflicts. Also, the
earlier
we merge the patch, the earlier we can notify everybody and help them
with
the normalization (if necessary), and we would have plenty of time to
address it in 4.19 (if the PR causes any issues).

- Regarding the stability of the patch, there are reports in the PR of
people testing it manually; and if someone else could test this patch
to
generate more evidence, it would be great. I see that the author is
taking
care of the GitHub actions/build errors whenever they appear (last
time I
checked - today - everything was fine); however, the blueorangutan's
environment and workflows are restricted to only a few people, which
makes
it difficult to the author to discover what is causing the "Trillian
Build
Failed" messages. In fact, help from someone with access to enable us
to
discover the reasons behind the failed builds has already been
requested
in
the PR weeks ago; however, no one answered; if someone with access
could
provide logs or information about the blueorangutan situation, I am
pretty
sure we could find a solution to a problem, if there is anything else
left
in the patch that is breaking the build.

- The approach (OOP's inheritance concept) used for addressing the tech
debt reduces the surface of impact for further improvements. Let's say
we
have to upgrade to Log4J 3.x in the future; instead of having 2000+
classes
to change, we would have to change just a few classes. We could reduce
even
more the surface of impact if we normalize other aspects of the code
regarding OOP and Spring (it can be handled in other PRs, for
instance).
With that, and based on the points I mentioned in the String libs
discussion we had some time ago [1], the idea of creating a 1x1 facade
for
the loggers does not seem interesting to me.

These are my humble thoughts about these topics, and I agree the
community
must reach a consensus about them.

Best regards,
Daniel Salvador (gutoveronezi)

[1] https://lists.apache.org/thread/cmz5fgz5g32h9onk10b9n8561v6jp79q

On Wed, May 3, 2023 at 8:35 AM Rohit Yadav <rohit.ya...@shapeblue.com>
wrote:

All,

I would like to bring attention to and refer to a comment I made on
the
PR
a few weeks ago

https://github.com/apache/cloudstack/pull/7131#issuecomment-1488671229
and good to see this finally being discussed. Thanks João for
starting
this
ML thread and all the hard work you and other contributors have put
into
this.

I took some time to reflect on the discussions so far, gather facts,
share
my findings and thoughts for the community to review, consider and
discuss:
   1.
I'm happy and it's encouraging to see all this hard work and new
contributors who want to be RM. In fact, lately, I've transitioned
into
such a mentoring role both in the community and at work, and willing
to
help guide and mentor such RMs.

It's important for our community to welcome contributions, all
contributors equally deserve a chance. The community should consider
the
facts, weigh in all the benefits and cons of any large contributions,
the
stability/risks associated, and discuss and build consensus if this
is
the
right thing for them. The project bylaws also give us tools and
mechanisms
to build such consensus, and resolve difference of opinions or to
formally
agree via voting.

   2.
We aren't using log4j1.x. In fact, we're using Reload4J (
https://reload4j.qos.ch/) which is a maintained log4j1.x fork by
Ceki
(original author, emeritus PMC member of log4j1.x project, and also a
logging expert who's behind sl4j, logback projects). Per the website,
Reload4J is also commercially supported by Exoscale (a fork of
cloudstack),
Spotify among others, and is regularly released. Reload4j is being
used
by
a dozen other top-level ASF projects including CloudStack. Reload4j
has
also released several security and improvement releases even after
the
notorious log4jshell (which we weren't affected as much as the 2.x
users).
I think Ceki is very community-oriented and trustworthy, he was kind
enough to share his insights and confirm on performance
considerations
between log4j 2.x vs 1.x/reload4j and that several features including
async
logger is in his roadmap. I'll add details in a sub-note if anybody
wants
to refer [1].

 From a dependency risk, longivity risk of using reload4j and security
perspective for CloudStack, I see no technical reason to rush or
there
exists an urgency to move away from reloadj4j as our logging library.
It
meets our basic requirements for a logging library well.

   3.  There is no mandate or bylaws from the ASF that we must use
log4j2x
or any ASF-only library. The CloudStack project is free to use
whatever
dependencies we think are good and in favour of the wider community.

   4.  While I see a long list of features from log4j2.x, which
particular
features or benefits do we really want that aren't already available
to
use
via reload4j?

This is an important technical question to discuss - are we doing the
work
simply because there is a new library/version, or there are serious
requirements? I genuinely want to learn what changed.

   1.
 From what I gather, there are side effects of merging the PR
https://github.com/apache/cloudstack/pull/7131 that we should
consider,
review, discuss, find solutions to them, and try to resolve (or
gather
consensus) on how to proceed:

      *   Once merged, it will potentially cause conflicts on more
than
100
outstanding pull requests (potentially any/all java changes).

Have we thought about which approach we'll like to take, try to have
these
100+ outstanding PRs merged in first or go ahead with this PR first?
Should
we give anykind of notice/documentation to them? What is the right
approach
for the community?

      *   Once merged, it will also cause conflicts and build failures
on
all PRs that are forward merged to a branch that has this PR merged.
(mostly due to logging package import and logger name changes). This
will
add overhead in older branch maintenance, though the health check PRs
can
be used to ensure stability of such forward-merges.

      *   This will impact and affect all non-public/private
integrations,
plugins, and vendors such as (Storpool, Linbit, and others) that are
building anything with, using, or on top of CloudStack. It may also
affect
their ability to deliver integration/feature/plugin jars which may
not
work
out of the current process of replacing them in the classpath (I know
Storpool and Linstor uses this approach at
/usr/share/cloudstack-management/libs/ as we don't bundle these jars
in
the
ACS mgmt uber-jar).

      *   Users - CloudStack release upgrades may not work out of the
box
for users who have made changes to their logging config, or are using
any
kind of integrations, plugins etc currently. I'm assuming the
reload4j/log4j 1.x config xml isn't compatible 100% with log4j2.x (I
could
be wrong). Some of the users may be even blocked to upgrade or use a
version of CloudStack until their dependencies, plugins/jars,
integrations
etc are updated (by them or their vendors) to work with CloudStack
log4j2.x.
However, this could be dealt with upgrade paths (in packages
post-install
script) to rename/replace with new log4j2.x config file and
documentation/notice in the release announcement and release notes.

Have we thought about giving any kind of deprecation/change notice to
the
wider community, and what kind of time frame or period is appropriate
for
them? Which release should this work be targeted for? Should we give
at
least six months notice and documentation for users, and target this
for
4.20?

   2.  I have concerns about the stability and risks of accepting the
PR,
I
check the PR and see build/CI failures on the PR. From the surface,
it
looks like the PR isn't nearly ready, and still requires a lot of
stabilisation effort (or does it also require changes in the CI
systems?).
The changes are huge so reviewing 100% is not possible, are the scope
changes strictly logging replacement or there are other changes?

What is the current state of the PR, is it ready? Has anybody looked
at
the failures - why are they happening, can we fix them? As with any
PR,
the
PR must pass all build and CI checks. Github Actions, BO/Trillian
based
test matrix across KVM, VMware, XenServer/XCP-ng can help (we can
help
kick
those tests).

What is also missing are considerations for manual testing of the
logs
(for ex. to consider has the logging format, file, etc changes - are
there
any other side-effects we haven't anticipated but might learn?). I
think
it's fixable - somebody (non-authors) needs to manually test them -
has
that happened?

   3.  The PR intends to address the tech debt but the approach could
be
improved. It doesn't wrap/refactor the logging library under a
utility
method, so one downside of this approach I see is if in the future
there
is
a new log4j 3.x library or something different, the community would
need
to
go through all of this again (perhaps would be easier with consistent
naming, still doesn't make it risk or effort free).

Does it make sense to refactor and create a wrapper utility method
(in
cloud-utils module), that allows a community to easily replace one
logging
library with another?

   4.  I think a round of discussion and some iterations are necessary
to
evaluate all concerns,  work towards a concrete proposal that should
weigh
in, and propose/addres how exact we transition to log4j 2.x if at
all.
Ideally, for such a large change we should give notice and a longer
time
period for the wider community to react, even bring it up, discuss in
meetups and conferences to build support so this wouldn't come as a
shock
to people or affect them - sort of prepare the soil, before we sow
the
seeds.

Hope this helps.

[1] I discussed https://github.com/qos-ch/reload4j/issues/47 and
then
for
my query got email from Ceki/Reload4j on async logger, also

Here are a few comments that you might find helpful.

In my opinion asynchronous loggers as they exist in log4j 2.x (based
on
LMAX Disruptor) use an extremely large buffer size. This buffer will
fill up in case of back pressure. There are two major cases.

Either:

1) all downstream appenders are fast

In that case, a very large buffer is useless as a reasonably sized
buffer will offer good performance.  For example, benchmarks [1] have
shown that logback's AsyncAppender is both simpler and significantly
faster.

2) at least one downstream appender is slow

In which case, the very large buffer will slowly fill up, possibly
exhausting available memory. More importantly, in case the
application
is stopped, this will result in large amount of logging data to be
lost,
which is far from ideal for a logging framework.

By the way, I intend to add  JsonLayout to reload4j in the coming
days.
JsonLayout will not require additional dependencies and should be
quite
useful.

Take care and best regards,

[1] https://logback.qos.ch/performance.html



Regards.

________________________________
From: Abhishek Kumar <shwst...@apache.org>
Sent: Wednesday, May 3, 2023 15:16
To: us...@cloudstack.apache.org <us...@cloudstack.apache.org>
Cc: dev@cloudstack.apache.org <dev@cloudstack.apache.org>
Subject: Re: ACS upgrade to Log4J2 version 2.19

Hi Daniel,

It was just my opinion it is based on the reasons that it is
something
that
we haven't seen any request in the community before and it will
create
some
challenges for the releases, forward-merging bug-fixes and also for
any
existing integrations that users might be having.
To be specific, I'm neutral (0) on this.
Great that you want to take up the RM work to address part of those
challenges. You have my support.
The only thing that I would hope for is to get this tested/merged
early
to
avoid issues at the later stages nearer to the release.

Regards,
Abhishek

On Mon, 1 May 2023 at 18:58, Daniel Salvador <
dvsalvador...@gmail.com>
wrote:

Abhishek,

I do not see why it would be a 5.0 change. Also, ACS 5.0 is a
discussion
the community has been having for a long time from now and is
something
we
are too far away to achieve consensus.

The patch is important to enable further development for the log
management
on ACS and facilitate everyone's life while coding and
troubleshooting.
If
you think it is too much work for the RM, I reiterate that I am
willing
to
be the 4.19 RM and conduct/execute all of the work.

Best regards,
Daniel Salvador (gutoveronezi)

On Mon, May 1, 2023 at 4:10 AM Abhishek Kumar <shwst...@apache.org
wrote:
Great work.
Though I feel this is a 5.0 change. I agree with Wei that this
would
create
too much overhead for upcoming releases. 4.18 was pushed ahead a
few
months
and we may end up on a similar path.
Also, reload4j is still actively maintained so I don't think this
is
urgent.

Regards,
Abhishek

On Fri, 28 Apr 2023 at 18:28, João Jandre Paraquetti <
j...@scclouds.com.br
wrote:

In PR #7131 (https://github.com/apache/cloudstack/pull/7131) I
have
proposed to normalize ACS's loggers, and more importantly,
upgrade
the
library log4j to log4j2 version 2.19.

Log4j2 has a lot of features that could offer benefits to ACS:

   * Async Loggers - performance similar to logging switched off
   * Custom log levels
   * Automatically reload its configuration upon modification
without
     loosing log events during reconfigurations.
   * Java 8-style lambda support for lazy logging (which enables
methods
     to be executed only when necessary, i.e.: the right log
level)
   * Log4j 2 is garbage-free (or at least low-garbage) since
version
2.6
   * Plugin Architecture - easy to extend by building custom
components
   * Log4j 2 API is separated from the Log4j 2 implementation.
   * Log4j 2 API supports more than just logging Strings:
CharSequences,
     Objects and custom Messages. Messages allow support for
interesting
     and complex constructs to be passed through the logging
system
and
     be efficiently manipulated. Users are free to create their
own
     Message types and write custom Layouts, Filters and Lookups
to
     manipulate them.
   * Concurrency improvements: log4j2 uses java.util.concurrent
libraries
     to perform locking at the lowest level possible. Log4j-1.x
has
known
     deadlock issues.
   * Configuration via XML, JSON, YAML, properties configuration
files
or
     programmatically.

In my personal experience using it in some other projects,
log4j2
is
easier to work with in general, has better performance, and is
an
active
project with constant development, innovation, and security
patches.
Moreover, it is under a well known and trusted open source
organization.
The change proposed in PR #7131
(https://github.com/apache/cloudstack/pull/7131) has been
tested
and
validated in a lot of different scenarios by different people.
We
have
already tested the logging in the management server, usage,
agents,
and
system VMs; all of that using KVM and Vmware + Veeam. Most
feature
sets
were tested, create/delete/update VMs, disks, cresate
snapshots,
user
management and so on.

The proposal has been discussed since January, 2023 in the PR
(https://github.com/apache/cloudstack/pull/7131), but I have
been
requested to bring it to the mailing list. I would love to hear
your
opinions on it, also, any reviews to the PR would be welcome.







--
Daan

Reply via email to