I posted the issue in github here: https://github.com/networkupstools/nut/issues/2999

I am not sure about labels or what else might be required.

Thank you!

On 7/1/25 03:23, Jim Klimov wrote:
I think yes, seems like a valid bug.

Also as you mention `upsdrvctl`, systemd and NUT v2.8.x together, take a look at https://github.com/networkupstools/nut/wiki/nut%E2%80%90driver%E2%80%90enumerator-(NDE) - it may be more applicable to use `upsdrvsvcctl` instead nowadays.

Jim


On Tue, Jul 1, 2025 at 12:35 AM Vyasa <i...@dalpha.com> wrote:

    Hi Jim,

    Thanks for the prompt response.

    The restart I referred to was exactly as you say: I restarted the
    service using `systemctl restart nut-server`.  This was separate
    from the reboot of the server machine that I mentioned, which
    resolves the issue.

    The driver used was:
    Network UPS Tools - UPS driver controller 2.8.0
    Network UPS Tools - BCMXCP UPS driver 0.32 (2.8.0)

    I simulated the fault again by putting the UPS in bypass and
    disconnecting the battery, which caused the RB alert again.  I
    then reconnected the battery, restored the UPS to its normal
    operating condition, and used upsdrvctl to stop and start the
    driver.

    Generating alert condition for simulating RB:
    Alert type: REPLBATT
    .....................
    ups.status: ALARM OL BYPASS RB
    ups.test.result: Done and error

    Alert cleared on UPS, and alert condition with RB persisting on
    NUT-SERVER:
    Alert type: ONLINE
    .................
    ups.status: OL RB

    ups.test.result: Done and passed

    Restarting using upsdrvctl start/stop command clears RB:
    Alert type: COMMOK
    ..................
    ups.status: OL
    ups.test.result: Done and passed

    So it seems that your suspicions and mine have been verified:
    bcmxcp appears to "latch" the alarm until a driver restart or a
    server reboot.
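
For anyone scripting around this, the three `ups.status` values above can be told apart by checking for `RB` as a whole space-separated token. This is only a minimal sketch: `status_has_flag` is a hypothetical helper name, not part of NUT, and it assumes the status string has already been obtained (for example from `upsc`).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a NUT API): return 1 if `flag` appears as a
 * whole space-separated token in `status`, otherwise 0. */
int status_has_flag(const char *status, const char *flag)
{
    size_t flag_len = strlen(flag);
    const char *p = status;

    if (flag_len == 0)
        return 0;                          /* an empty flag never matches */

    while (*p != '\0') {
        size_t tok_len;

        p += strspn(p, " ");               /* skip separators */
        tok_len = strcspn(p, " ");         /* length of the current token */
        if (tok_len == flag_len && strncmp(p, flag, flag_len) == 0)
            return 1;
        p += tok_len;                      /* move on to the next token */
    }
    return 0;
}
```

Matching whole tokens rather than substrings avoids false positives, such as finding `BY` inside `BYPASS`.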

    I think you are correct that this can cause issues in other
    subsets of real-life cases, for example with automation and
    scripting.

    What would you suggest at this point?  Can this be submitted as a bug?

    Vyasa



    On 6/30/25 14:18, Jim Klimov wrote:
    Hello,

      You mention that you've tried restarting the "nut-server" - I
    suppose you mean literally, the service unit by such name - of
    the NUT data server. Did you try restarting the unit for the NUT
    driver (e.g. `systemctl restart nut-driver@upsname` with NUT
    v2.8.x and newer)?

      You did not mention the driver used, but I wonder if that
    driver program "latches" the RB value when it goes bad and never
    updates it?.. This could make sense when UPS battery replacement
    means server downtime, but that is just a subset of real-life
    cases - so generally can be just an oversight. For example,
    `bcmxcp` code seems to only set
    `bcmxcp_status.alarm_replace_battery=1` (oddly neither the field
    nor struct is ever initialized to 0, so might be garbage on some
    systems/compilers that do not zero-out aggregate types by default).
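
The "latching" pattern described above can be illustrated with a minimal sketch. This is not the actual `bcmxcp` source: the struct and function names below are hypothetical, and only the field name `alarm_replace_battery` is taken from the message. The sketch contrasts a flag that is only ever set with one that follows each poll result.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's status struct; only the field
 * name alarm_replace_battery comes from the discussion above. */
struct ups_status_sketch {
    int alarm_replace_battery;
};

/* Latching pattern: the flag is raised when the alarm is seen,
 * but nothing ever lowers it again. */
void poll_latching(struct ups_status_sketch *st, bool alarm_active)
{
    if (alarm_active)
        st->alarm_replace_battery = 1;
}

/* Tracking pattern: the flag follows what the UPS reported on this poll. */
void poll_tracking(struct ups_status_sketch *st, bool alarm_active)
{
    st->alarm_replace_battery = alarm_active ? 1 : 0;
}
```

With a struct explicitly initialized as `= {0}`, one poll with the alarm active followed by one with it cleared leaves the flag at 1 under `poll_latching` and at 0 under `poll_tracking`. The first result would match an RB status that persists until the driver process is restarted.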

    Jim


    On Mon, Jun 30, 2025 at 7:53 PM Vyasa via Nut-upsuser
    <nut-upsuser@alioth-lists.debian.net> wrote:

        Hello,

        CONFIGURATION:

        I am using a Powerware PW9120 3000i, on a network
        configuration with a server and a couple of slaves.

        The nut-server OS is /Debian 12 (6.1.0-37-amd64)/.  Nut was
        installed from the Debian repo with version /2.8.0-7 amd64/,
        and client has the same version.

        UPS is connected with a standard RS232 serial connection, and
        works with all standard commands and functionality.

        Command "/upscmd -l upsname/" provides the following, where I
        have successfully used /test.battery.start/ and
        /test.system.start/:

        beeper.disable - Disable the UPS beeper
        beeper.enable - Enable the UPS beeper
        beeper.mute - Temporarily mute the UPS beeper
        load.on - Turn on the load immediately
        outlet.1.load.off - Turn off the load on outlet 1 immediately
        outlet.1.load.on - Turn on the load on outlet 1 immediately
        outlet.1.shutdown.return - Turn off the outlet 1 and return when power is back
        outlet.2.load.off - Turn off the load on outlet 2 immediately
        outlet.2.load.on - Turn on the load on outlet 2 immediately
        outlet.2.shutdown.return - Turn off the outlet 2 and return when power is back
        shutdown.return - Turn off the load and return when power is back
        shutdown.stayoff - Turn off the load and remain off
        test.battery.start - Start a battery test
        test.system.start - Start a system test

        ISSUE:

        Every couple of years, when I have to replace the batteries in
        the UPS, I am unable to clear the REPLBATT alert until I
        reboot the server running NUT-SERVER. This might not seem like
        a big deal, but it becomes a hassle when the batteries haven't
        quite failed yet and still pass a UPS battery test.

        The UPS itself reports OK after battery replacement or
        battery test, and clears alarm on its LCD.  But when I poll
        the UPS data using "upsc upsname" I still see the RB or
        REPLBATT and this will not clear until I reboot the server. 
        So without reboot the alert will then be generated based on
        RBWARNTIME in upsmon.conf, which is as per nut design.

        So without reboot I always get the RB flag with status:

        /Alert type: REPLBATT/
        /............/
        /ups.status: OL RB/
        /ups.test.result: Done and passed/

        After reboot of server the alert is cleared:

        /Alert type: COMMOK
        ............
        ups.status: OL
        ups.test.result: Done and passed/

        So my question is: why is this reboot required?  It doesn't
        seem to make any sense.  I can't understand why the polled
        data from a UPS would change after a reboot while the UPS LCD
        is reporting all OK.  I tried restarting NUT-SERVER to see if
        it would make any difference.  Also, the command
        test.battery.start will clear the alarm on the UPS if the
        battery test is good.

        The only explanation that I have come up with is that the
        persistent RB/REPLBATT is latched to this condition and is an
        artifact of UPS to NUT handshaking.

        Any feedback would be kindly appreciated, as I have searched
        and searched.

        Thank you!

        Vyasa
        _______________________________________________
        Nut-upsuser mailing list
        Nut-upsuser@alioth-lists.debian.net
        https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser