Apologies for the delay, I've been busy with +1.

** Description changed:
+ [ Impact ]
+ 
  This bug tracks an update for the rabbitmq-server package in Ubuntu.
  
- This bug tracks an update to the following versions:
- * Focal (20.04): rabbitmq-server 3.8.3
  * Jammy (22.04): rabbitmq-server 3.9.27
  
- (NOTE) - Jammy is only updating to 3.9.27 because 3.9.28 requires Erlang 24.3. If Erlang updates in the future, then we can upgrade further.
  (NOTE) - Focal is only updating to 3.8.3 from 3.8.2 because 3.8.4 requires etcd v3.4.
  
+ This is the first MRE of rabbitmq-server.
+ 
+ Upstream has a very rapid release cadence with micro releases that contain many bug fixes that would be good to bring into our LTS releases.
+ 
+ One major hurdle with this is the lack of proper dep8 tests, so a limited suite of dep8 tests was created for this MRE, which is planned to be integrated into newer releases once approved.
+ 
+ rabbitmq-server is a complicated package and the new dep8 tests will not be able to cover everything, so our OpenStack charms CI/CD ran the new version to provide more confidence in the package and to at least verify that our workflow works. The results of these runs can be found at https://review.opendev.org/c/openstack/charm-rabbitmq-server/+/915836.
+ 
+ In addition to this, only Jammy has GitHub workflows to build and test the package; the results can be found at https://github.com/mitchdz/rabbitmq-server-3-9-27-tests/actions/runs/8955069098/job/24595393599.
- This is the first MRE of rabbitmq-server.
+ 
+ Changelogs can be found at https://github.com/rabbitmq/rabbitmq-server/tree/main/release-notes
- Upstream has a very rapid release cadence with micro releases that
- contain many bug fixes that would be good to bring into our LTS
- releases.
+ 
+ [ Test Plan ]
+ 
+ The test plan for rabbitmq-server involves 3 different types of tests.
- One major hurdle with this is the lack of proper dep8 tests, which a
- limited suite of dep8 tests were created for this MRE, which is planned
- to get integrated into newer releases once approved.
+ 
+ 1. OpenStack CI/CD
+ This is what we run for CI/CD. Testing the newer version in CI/CD exercises real-world use cases and is the minimum that should be done to ensure our own tooling works. The tester will need to request that the new version be run by the OpenStack team. An example of such a run, as mentioned before, is:
- rabbitmq-server is a complicated package that the new dep8 tests will
- not be able to cover everything, therefore our openstack charms CI/CD
- ran the new version to provide more confidence in the package, and to at
- least verify that our workflow works. The results of these runs can be
- found at https://review.opendev.org/c/openstack/charm-rabbitmq-
- server/+/915836.
+ 
+ https://review.opendev.org/c/openstack/charm-rabbitmq-server/+/915836
+ 
+ 2. dep8 tests
+ New dep8 tests were added to the package and must pass. These cover simple but real use cases.
- In addition to this, only Jammy has github workflows to build+test the
- package, where the results can be found at
- https://github.com/mitchdz/rabbitmq-
- server-3-9-27-tests/actions/runs/8955069098/job/24595393599.
- 
- Reviewing the changes, there is only one change that I want to bring to attention. That is, version 3.9.23 (https://github.com/rabbitmq/rabbitmq-server/releases/tag/v3.9.23) introduces the following change:
- Nodes now default to 65536 concurrent client connections instead of using the effective kernel open file handle limit
+ 
+ 3. Upgrade testing
+  1. lxc launch ubuntu:jammy j-vm --vm
+  2. lxc shell j-vm
+  3. sudo apt install -y rabbitmq-server
+  4. Enable proposed (see the sketch below)
+  5. sudo apt install -y rabbitmq-server
+     # ensure no errors or issues during upgrade
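For step 4, one way to enable -proposed is the following minimal sketch. It assumes an amd64 jammy VM and the default archive mirror (other architectures use ports.ubuntu.com); pulling rabbitmq-server explicitly from the -proposed suite keeps the rest of the system on the release pocket:

``` sh
# Add the jammy-proposed pocket (rabbitmq-server lives in main).
echo "deb http://archive.ubuntu.com/ubuntu jammy-proposed main universe" | \
  sudo tee /etc/apt/sources.list.d/ubuntu-jammy-proposed.list
sudo apt update

# Upgrade only rabbitmq-server from -proposed rather than everything.
sudo apt install -y rabbitmq-server/jammy-proposed
```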
- ------------------------------------------------------------------------------
- 
- Jammy Changes:
- 
- Notices:
- + Nodes now default to 65536 concurrent client connections instead of using the effective kernel open file handle limit. Users who want to override this default, that is, have nodes that should support more concurrent connections and open files, now have to perform an additional configuration step:
- 
-   1. Pick a new limit value they would like to use, for instance, 100K
-   2. Set the maximum open file handle limit (for example, via `systemd` or similar tooling) for the OS user used by RabbitMQ to 100K
-   3. Set the ERL_MAX_PORTS environment variable to 100K
- 
- This change was introduced because of a change in several Linux distributions: they now use a default open file handle limit so high that it causes a significant amount (say, 1.5 GiB) of memory to be preallocated by the Erlang runtime.
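To illustrate the three steps above, and the ERL_MAX_PORTS/LimitNOFILE check the test plan asks for, here is a minimal systemd drop-in sketch. The 100000 value and the drop-in file name are arbitrary examples, not package defaults:

``` sh
# Hypothetical drop-in; 100000 is an example value, not a package default.
sudo mkdir -p /etc/systemd/system/rabbitmq-server.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/rabbitmq-server.service.d/limits.conf
[Service]
LimitNOFILE=100000
Environment=ERL_MAX_PORTS=100000
EOF

sudo systemctl daemon-reload
sudo systemctl restart rabbitmq-server

# Verify both values were honored by the running unit/node:
systemctl show rabbitmq-server -p LimitNOFILE
sudo rabbitmqctl eval 'erlang:system_info(port_limit).'
```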
- 
- Updates:
- + Free disk space monitor robustness improvements.
- + `raft.adaptive_failure_detector.poll_interval` exposes aten's poll_interval setting to RabbitMQ users. Increasing it can reduce the probability of false positives in clusters where inter-node communication links are used at close to maximum capacity. The default is `5000` (5 seconds).
- + When both `disk_free_limit.relative` and `disk_free_limit.absolute`, or both `vm_memory_high_watermark.relative` and `vm_memory_high_watermark.absolute` are set, the absolute settings will now take precedence.
- + New key supported by `rabbitmqctl list_queues`: `effective_policy_definition` that returns merged definitions of regular and operator policies effective for the queue.
- + New HTTP API endpoint, `GET /api/config/effective`, returns effective node configuration. This is an HTTP API counterpart of `rabbitmq-diagnostics environment` (see the sketch after this list).
- + Force GC after definition import to reduce peak memory load by mostly idle nodes that import a lot of definitions.
- + A way to configure an authentication timeout, much like in some other protocols RabbitMQ supports.
- + Windows installer: service startup is now optional. More environment variables are respected by the installer.
- + In environments where DNS resolution is not yet available at the time RabbitMQ nodes boot and try to perform peer discovery, such as CoreDNS with a default caching interval of 30s on Kubernetes, nodes will now retry hostname resolution (including of their own host) several times with a wait interval.
- + Prometheus plugin now exposes one more metric, `process_start_time_seconds`, the moment of node process startup in seconds.
- + Reduce log noise when `sysctl` cannot be accessed by the node memory monitor.
- + Shovels now handle consumer delivery timeouts gracefully and restart.
- + Optimization: internal message GUID is no longer generated for quorum queues and streams, as they are specific to classic queues.
- + Two more AMQP 1.0 connection lifecycle events are now logged.
- + TLS configuration for inter-node stream replication connections can now use function references and definitions.
- + Stream protocol connection logging is now less verbose.
- + Max stream segment size is now limited to 3 GiB to avoid a potential stream position overflow.
- + Logging messages that use microseconds now use "us" for the SI symbol to be compatible with more tools.
- + Consul peer discovery now supports client-side TLS options, much like its Kubernetes and etcd peers.
- + A minor quorum queue optimization.
- + 40 to 50% throughput improvement for some workloads where AMQP 0-9-1 clients consumed from a [stream](https://rabbitmq.com/stream.html).
- + Configuration of fallback secrets for Shovel and Federation credential obfuscation. This feature allows for secret rotation during rolling cluster node restarts.
- + Reduced memory footprint of individual consumer acknowledgements of quorum queue consumers.
- + `rabbitmq-diagnostics status` now reports the crypto library (OpenSSL, LibreSSL, etc.) used by the runtime, as well as its version details.
- + With a lot of busy quorum queues, nodes hosting a moderate number of leader replicas could experience a growing memory footprint of one of the Raft implementation processes.
- + Re-introduced key file log rotation settings. Some log rotation settings were left behind during the migration to the standard runtime logger starting with 3.9.0; now some key settings have been re-introduced.
- + Cleaned up some compiler options that are no longer relevant.
- + Quorum queues: better forward compatibility with RabbitMQ 3.10.
- + Significantly faster queue re-import from definitions on subsequent node restarts. Initial definition import still takes the same amount of time as before.
- + Significantly faster exchange re-import from definitions on subsequent node restarts. Initial definition import still takes the same amount of time as before.
- + RabbitMQ nodes will now filter out certain log messages related to connections, channels, and queue leader replicas receiving internal protocol messages sent to this node before a restart. These messages usually raise more questions and cause confusion than help.
- + More Erlang 24.3 `eldap` library compatibility improvements.
- + Restart of a node that hosted one or more stream leaders resulted in their consumers not "re-attaching" to the newly elected leader.
- + Large fanouts experienced a performance regression when streams were not enabled using a feature flag.
- + Stream management plugin did not support mixed-version clusters.
- + Stream deletion did not result in a `basic.cancel` being sent to AMQP 0-9-1 consumers.
- + Stream clients did not receive a correct stream unavailability error in some cases.
- + It is again possible to clear user tags and update the password in a single operation.
- + Forward compatibility with Erlang 25.
- + File handle cache efficiency improvements.
- + Unknown stream properties (e.g. those requested by a node that runs a newer version) are now handled gracefully.
- + Temporary hostname resolution issues (attempts that fail with `nxdomain`) are now handled more gracefully and with a delay of several seconds.
- + Build time compatibility with Elixir 1.13.
- + `auth_oauth2.additional_scopes_key` in `rabbitmq.conf` was not converted correctly during configuration translation and thus had no effect.
- + Adapt to a breaking Erlang 24.3 LDAP client change.
- + Shovels can now be declared with the `delete-after` parameter set to `0`. Such shovels will immediately stop instead of erroring and failing to start after a node restart.
- + Support for Consul 1.1 response code changes when an operation is attempted on a non-existent health check.
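A minimal sketch of the two inspection features mentioned in the list above. The localhost:15672 endpoint and guest/guest credentials are assumed management-plugin defaults (valid only on localhost), not something this update changes:

``` sh
# Fetch the effective node configuration over the new HTTP API endpoint.
curl -u guest:guest http://localhost:15672/api/config/effective

# The new list_queues key from the same release series:
sudo rabbitmqctl list_queues name effective_policy_definition
```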
- 
- Bug Fixes:
- + Classic queues with Single Active Consumer enabled could run into an exception.
- + When a global parameter was cleared, nodes emitted an internal event of the wrong type.
- + Fixed a type analyzer definition.
- + LDAP server password could end up in the logs in certain types of exceptions.
- + `rabbitmq-diagnostics status` now handles server responses where free disk space is not yet computed. This is the case with nodes early in the boot process.
- + Management UI links now include "noopener" and "noreferrer" attributes to protect them against reverse tabnabbing. Note that since the management UI only includes a small number of external links to trusted resources, reverse tabnabbing is unlikely to affect most users. However, it can show up in security scanner results and become an issue in environments where a modified version of RabbitMQ is offered as a service.
- + Plugin could stop in environments where no static Shovels were defined and a specific sequence of events happens at the same time.
- + When the installation directory was overridden, the plugins directory did not respect the updated base installation path.
- + Intra-cluster communication link metric collector could run into an exception when the peer connection had just been re-established, e.g. after a peer node restart.
- + When a node was put into maintenance mode, it closed all MQTT client connections cluster-wide instead of just local client connections.
- + Reduced log noise from exceptions connections could run into when a client was closing its connection end concurrently with other activity.
- + `rabbitmq-env-conf.bat` on Windows could fail to load when its path contained spaces.
- + Stream declaration could run into an exception when stream parameters failed validation.
- + Some counters on the Overview page have been moved to global counters introduced in RabbitMQ 3.9.
- + Avoid an exception when an MQTT client closes its TCP connection before the server could fully process a `CONNECT` frame sent earlier by the same client.
- + Channels on connections to mixed clusters that had 3.8 nodes in them could run into an exception.
- + Inter-node cluster link statistics did not have any data when TLS was enabled for them.
- + Quorum queues now correctly propagate errors when a `basic.get` (polling consumption) operation hits a timeout.
- + Stream consumer that used AMQP 0-9-1 instead of a stream protocol client, and disconnected, leaked a file handle.
- + Max frame size and client heartbeat parameters for RabbitMQ stream clients were not correctly set when taken from `rabbitmq.conf`.
- + Removed a duplicate exchange decorator set operation.
- + Node restarts could result in a hashing ring inconsistency.
- + Avoid seeding the default user in old clusters that still use the deprecated `management.load_definitions` option.
- + Streams could run into an exception or fetch stale stream position data in some scenarios.
- + `rabbitmqctl set_log_level` did not have any effect on logging via `amq.rabbitmq.log`.
- + `rabbitmq-diagnostics status` is now more resilient and won't fail if free disk space monitoring repeatedly fails (gets disabled) on the node.
- + CLI tools failed to run on Erlang 25 because an old version of Elixir (compiled on Erlang 21) was used in the release pipeline. Erlang 25 no longer loads modules compiled on Erlang 21 or older.
- + Default log level used a four-character severity abbreviation instead of the more common longer format, for example, `warn` instead of `warning`.
- + `rabbitmqctl set_log_level` documentation clarification.
- + Nodes now make sure that the maintenance mode status table exists after node boot as long as the feature flag is enabled.
- + "In flight" messages directed to an exchange that has just been deleted will be silently dropped or returned back to the publisher instead of causing an exception.
- + `rabbitmq-upgrade await_online_synchronized_mirror` is now a no-op in single node clusters.
- + One metric that was exposed via CLI tools and the management plugin's HTTP API was not exposed via the Prometheus scraping API.
- + Stream delivery rate could drop if concurrent stream consumers consumed in a way that made them reach the end of the stream often.
- + If a cluster that had streams enabled was upgraded with a jump of multiple patch releases, stream state could fail an upgrade.
- + Significantly faster queue re-import from definitions on subsequent node restarts. Initial definition import still takes the same amount of time as before.
- + When a policy contained keys unsupported by a particular queue type, and was later updated or superseded by a higher priority policy, the effective optional argument list could become inconsistent (the policy would not have the expected effect).
- + Priority queues could run into an exception in some cases.
- + Maintenance mode could run into a timeout during queue leadership transfer.
- + Prometheus collector could run into an exception early on a node's schema database sync.
- + Connection data transfer rate units were incorrectly displayed when the rate was less than 1 kiB per second.
- + `rabbitmqadmin` now correctly loads TLS-related keys from its configuration file.
- + Corrected a help message for the node memory usage tooltip.
- 
- * Added new dep8 tests:
-   - d/t/hello-world
-   - d/t/publish-subscribe
-   - d/t/rpc
-   - d/t/work-queue
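For reference, the d/t/* suites listed above can be exercised locally with autopkgtest. A minimal sketch, assuming the package source tree is the current directory and LXD is already initialised:

``` sh
sudo apt install -y autopkgtest
autopkgtest-build-lxd images:ubuntu/jammy/amd64

# Run the package's d/t/* tests against the image built above.
autopkgtest . -- lxd autopkgtest/ubuntu/jammy/amd64
```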
- * Remove patches fixed upstream:
-   - d/p/lp1999816-fix-rabbitmqctl-status-disk-free-timeout.patch
- 
- ------------------------------------------------------------------------------
- 
- Focal Changes:
- * New upstream version 3.8.3 (LP: #2060248).
- 
- Updates:
- + Some proxy protocol errors are now logged at debug level. This reduces log noise in environments where TCP load balancers and proxies perform health checks by opening a TCP connection but never sending any data.
- + Quorum queue deletion operation no longer supports the "if unused" and "if empty" options. They are typically used for transient queues and don't make much sense for quorum ones.
- + Do not treat applications that do not depend on rabbit as plugins. This is especially important for applications that should not be stopped before rabbit is stopped.
- + RabbitMQ nodes will now gracefully shut down when receiving a `SIGTERM` signal. Previously the runtime would invoke a default handler that terminates the VM, giving RabbitMQ no chance to execute its shutdown steps.
- + Every cluster now features a persistent internal cluster ID that can be used by core features or plugins. Unlike the human-readable cluster name, the value cannot be overridden by the user.
- + Speed up execution of boot steps by a factor of 2N, where N is the number of attributes per step.
- + New health checks that can be used to determine if it's a good moment to shut down a node for an upgrade:
- 
- ``` sh
- # Exits with a non-zero code if target node hosts leader replica of at
- # least one queue that has out-of-sync mirror.
- rabbitmq-diagnostics check_if_node_is_mirror_sync_critical
- 
- # Exits with a non-zero code if one or more quorum queues will lose
- # online quorum should target node be shut down
- rabbitmq-diagnostics check_if_node_is_quorum_critical
- ```
- 
- + Management and Management Agent Plugins:
-   * An undocumented "automagic login" feature on the login form was removed.
-   * A new `POST /login` endpoint can be used by custom management UI login forms to authenticate the user and set the cookie.
-   * A new `POST /rebalance/queues` endpoint that is the HTTP API equivalent of `rabbitmq-queues rebalance` (see the curl sketch after this list).
-   * Warning about a missing `handle.exe` in `PATH` on Windows is now only logged every 10 minutes.
-   * `rabbitmqadmin declare queue` now supports a new `queue_type` parameter to simplify declaration of quorum queues.
-   * HTTP API request log entries now include the acting user.
-   * Content Security Policy headers are now also set for static assets such as JavaScript files.
- + Prometheus Plugin:
-   * Add option to aggregate metrics for channels, queues & connections. Metrics are now aggregated by default (safe by default). This new behaviour can be disabled via the `prometheus.return_per_object_metrics = true` config.
- + Kubernetes Peer Discovery Plugin:
-   * The plugin will now notify the Kubernetes API of node startup and peer stop/unavailability events.
- + Federation Plugin:
-   * "Command" operations such as binding propagation now use a separate channel for all links, preventing latency spikes for asynchronous operations (such as message publishing) (a head-of-line blocking problem).
- + Auth Backend OAuth 2 Plugin:
-   * Additional scopes can be fetched from a predefined JWT token field. Those scopes will be combined with the standard scopes field.
- + Trust Store Plugin:
-   * HTTPS certificate provider will no longer terminate if the upstream service response contains invalid JSON.
- + MQTT Plugin:
-   * Avoid blocking when registering or unregistering a client ID.
- + AMQP 1.0 Client Plugin:
-   * Handle heartbeat in `close_sent/2`.
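A minimal sketch of the rebalance endpoint mentioned above and its CLI equivalent. The /api prefix, localhost:15672 endpoint, and guest/guest credentials are assumptions based on the management plugin's usual layout, not something stated in the changelog entry:

``` sh
# Trigger a queue rebalance over the management HTTP API.
curl -u guest:guest -X POST http://localhost:15672/api/rebalance/queues

# CLI equivalent named in the changelog entry:
sudo rabbitmq-queues rebalance all
```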
- 
- Bug Fixes:
- + Reduced scheduled GC activity in the connection socket writer to one run per 1 GiB of data transferred, with an option to change the value or disable scheduled runs entirely.
- + Eliminated an inefficiency in recovery of quorum queues with a backlog of messages.
- + In a case where a node hosting a quorum queue replica went offline and was removed from the cluster, and later came back, quorum queues could enter a loop of Raft leader elections.
- + Quorum queues with dead lettering could fail to recover.
- + The node can now recover even if the virtual host recovery terms file was corrupted.
- + Autoheal could fail to finish if one of its state transitions initiated by a remote node timed out.
- + Syslog client is now started even when Syslog logging is configured only for some log sinks.
- + Policies that quorum queues ignored were still listed as applied to them.
- + If a quorum queue leader rebalancing operation timed out, CLI tools failed with an exception instead of a sensible internal API response.
- + Handle timeout error on the rebalance function.
- + Handle and raise protocol error for absent queues assumed to be alive.
- + `rabbitmq-diagnostics status` failed to display the results when executed against a node that had a high VM watermark set as an absolute value (using `vm_memory_high_watermark.absolute`).
- + Management and Management Agent Plugins:
-   * Consumer section on individual page was unintentionally hidden.
-   * Fix queue-type select by adding unsafe-inline CSP policy.
- + Etcd Peer Discovery Plugin:
-   * Only run healthcheck when backend is configured.
- + Federation Plugin:
-   * Use vhost to delete federated exchange.
- 
- * Added new dep8 tests:
-   - d/t/smoke-test
-   - d/t/hello-world
-   - d/t/publish-subscribe
-   - d/t/rpc
-   - d/t/work-queue
+ 
+ For jammy, also ensure ERL_MAX_PORTS and LimitNOFILE are correctly honored on upgrade (the systemd sketch above shows one way to check).
+ 
+ [ Where problems could occur ]
+ 
+ * This is the first MRE of this package, so extra caution should be taken.
+ * Upgrading the server may cause downtime during the upgrade.
+ * Upgrade failures can occur if users have misconfigured rabbitmq-server and the maintainer scripts attempt to stop/start the server.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2060248

Title:
  MRE updates of rabbitmq-server for Jammy,Focal

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rabbitmq-server/+bug/2060248/+subscriptions
