Bug#1064765: prometheus: FTBFS: dh_auto_test error
Version: 2.45.3+ds-3

On 28/3/24 at 23:38, Daniel Swarbrick wrote:
> On 28.03.24 23:33, Santiago Vila wrote:
>> If you prefer I could report this build failure in a new report (or you can also use the clone command so that the bug has a new number, then close the old bug).
>
> Please report a new bug, with just the relevant info regarding the new build failure.

Ok, will do. Closing this one, then.

> We already override the default test timeout for arm, mips64el and riscv64 to 60 minutes, as well as set "-short", because otherwise those archs simply take too long to grind through all the tests.
>
> If you expect these tests to pass on a host with only one or two cores, we will certainly need to raise the test timeout, even for fast amd64 hosts.

Yes, I expect it to work on all systems on which prometheus itself works. (And as a prometheus user myself, I've used it in all sorts of systems, big or small).

Thanks.
Bug#1064765: prometheus: FTBFS: dh_auto_test error
On 28.03.24 23:33, Santiago Vila wrote:
> If you prefer I could report this build failure in a new report (or you can also use the clone command so that the bug has a new number, then close the old bug).

Please report a new bug, with just the relevant info regarding the new build failure.

We already override the default test timeout for arm, mips64el and riscv64 to 60 minutes, as well as set "-short", because otherwise those archs simply take too long to grind through all the tests.

If you expect these tests to pass on a host with only one or two cores, we will certainly need to raise the test timeout, even for fast amd64 hosts.
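The per-architecture override described in this message can be sketched roughly as follows. This is only an illustration in Python, not the package's actual debian/rules (which is a makefile and may be organised quite differently). The function name `go_test_flags` and the set `SLOW_ARCHES` are hypothetical; the architecture names are taken literally from the message, and the 20m default is the figure quoted elsewhere in this thread. `-timeout` and `-short` are standard `go test` flags.

```python
# Hypothetical mirror of the override described in the message above.
# "arm" is copied literally from the message; it probably stands for one or
# more Debian ARM ports, but which ones is not stated in this thread.
SLOW_ARCHES = {"arm", "mips64el", "riscv64"}

DEFAULT_TIMEOUT = "20m"  # default test timeout mentioned in this thread
SLOW_TIMEOUT = "60m"     # raised timeout for the slow architectures


def go_test_flags(arch):
    """Return the extra `go test` flags for the given architecture name."""
    if arch in SLOW_ARCHES:
        # Slow architectures get a longer timeout and skip long-running tests.
        return ["-timeout", SLOW_TIMEOUT, "-short"]
    return ["-timeout", DEFAULT_TIMEOUT]
```

Under this sketch, supporting one- or two-core amd64 hosts would mean raising `DEFAULT_TIMEOUT` as well, which is the change the message anticipates.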
Bug#1064765: prometheus: FTBFS: dh_auto_test error
On 28/3/24 at 22:42, Daniel Swarbrick wrote:
> I think you are taking the "FAILED" out of context and misinterpreting the test output.

This is very likely, yes, and I'm sorry for that.

If you prefer I could report this build failure in a new report (or you can also use the clone command so that the bug has a new number, then close the old bug).

Thanks.
Bug#1064765: prometheus: FTBFS: dh_auto_test error
As expected:

=== RUN TestQuerierIndexQueriesRace/[m!="0"___name__="metric"]
panic: test timed out after 20m0s
...
FAIL github.com/prometheus/prometheus/tsdb 1200.016s

On 28.03.24 23:13, Santiago Vila wrote:
> Ok, I'm attaching one of my build logs for version 2.45.3+ds-3.
> This one was tried on a m6a.large instance from AWS, which has 2 CPUs.
>
> Thanks.
Bug#1064765: prometheus: FTBFS: dh_auto_test error
On 28/3/24 at 22:42, Daniel Swarbrick wrote:
> Please can you find in your logs the _actual_ failing test or tests, because it is not TestRulesUnitTest.

Ok, I'm attaching one of my build logs for version 2.45.3+ds-3.
This one was tried on an m6a.large instance from AWS, which has 2 CPUs.

Thanks.

Attachment: prometheus_2.45.3+ds-3_amd64-20240325T141933.921Z.gz (application/gzip)
Bug#1064765: prometheus: FTBFS: dh_auto_test error
On 28.03.24 15:00, Santiago Vila wrote:
> In either case, this is still happening for me in the current version:
>
> Lucas Nussbaum wrote:
>> FAILED: 1:48: parse error: unexpected character inside braces: '0'

I think you are taking the "FAILED" out of context and misinterpreting the test output. Those are TestRulesUnitTest/* subtests, which are expected to fail. The summary at the end shows the expected results:

=== RUN TestRulesUnitTest
...
=== RUN TestRulesUnitTest/Bad_input_series
Unit Testing: ./testdata/bad-input-series.yml
FAILED: 1:48: parse error: unexpected character inside braces: '0'
...
--- PASS: TestRulesUnitTest (0.38s)
    --- PASS: TestRulesUnitTest/Passing_Unit_Tests (0.22s)
    --- PASS: TestRulesUnitTest/Long_evaluation_interval (0.13s)
    --- PASS: TestRulesUnitTest/Bad_input_series (0.00s)
    --- PASS: TestRulesUnitTest/Bad_PromQL (0.00s)
    --- PASS: TestRulesUnitTest/Bad_rules_(syntax_error) (0.00s)
    --- PASS: TestRulesUnitTest/Bad_rules_(error_evaluating) (0.00s)
    --- PASS: TestRulesUnitTest/Simple_failing_test (0.01s)
    --- PASS: TestRulesUnitTest/Disabled_feature_(@_modifier) (0.00s)
    --- PASS: TestRulesUnitTest/Enabled_feature_(@_modifier) (0.00s)
    --- PASS: TestRulesUnitTest/Disabled_feature_(negative_offset) (0.00s)
    --- PASS: TestRulesUnitTest/Enabled_feature_(negative_offset) (0.00s)

You will see this in the output of _passing_ debci test runs.

Please can you find in your logs the _actual_ failing test or tests, because it is not TestRulesUnitTest.

If you are running tests on a VM with a single core, it's quite likely that you're hitting the test timeout, which I would find a more reasonable explanation for the dh_auto_test error, since the Prometheus tests are quite extensive. They will take more than an hour on an 11th gen Intel Core i7 if I set GOMAXPROCS=1. Since debian/rules is setting a test timeout of 20m by default, this would fail.
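To locate the actual failure in a long build log, one option is to ignore the "FAILED:" lines printed inside passing subtests and look only for the markers that `go test` emits for real failures: "--- FAIL:" result lines, the "panic: test timed out after ..." message, and the per-package "FAIL <package> <time>" summary. The following is a minimal sketch of that idea; the function name `find_real_failures` is hypothetical and the patterns are assumptions based on the output quoted in this thread, not an exhaustive parser for `go test` output.

```python
import re

# Markers of a genuine failure in `go test -v` output.
FAIL_RESULT = re.compile(r"^\s*--- FAIL: (\S+)")
TIMEOUT_PANIC = re.compile(r"^panic: test timed out after (\S+)")
PACKAGE_FAIL = re.compile(r"^FAIL\s+(\S+)\s+[\d.]+s$")


def find_real_failures(log_text):
    """Return (failed_tests, timeouts, failed_packages) found in a build log.

    Lines such as "FAILED: 1:48: parse error ..." are deliberately not
    matched, because they also appear inside subtests that end in "--- PASS".
    """
    failed_tests, timeouts, failed_packages = [], [], []
    for line in log_text.splitlines():
        match = FAIL_RESULT.match(line)
        if match:
            failed_tests.append(match.group(1))
            continue
        match = TIMEOUT_PANIC.match(line)
        if match:
            timeouts.append(match.group(1))
            continue
        match = PACKAGE_FAIL.match(line)
        if match:
            failed_packages.append(match.group(1))
    return failed_tests, timeouts, failed_packages
```

A log that contains only the TestRulesUnitTest "FAILED:" lines would yield three empty lists, while a log that hit the test timeout would report the timeout duration and the affected package.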
Bug#1064765: prometheus: FTBFS: dh_auto_test error
On 28.03.24 15:00, Santiago Vila wrote:
> Daniel Swarbrick wrote:
>> * Add new 0022-Support-prometheus-common-0.47.0.patch (Closes: #1064765)
>
> Hello. I don't quite understand how the above fix is related to the bug itself (but maybe it's because I don't know prometheus internals).

As described in the patch:

  This cherry-picks part of a commit relating to negotiation of the "Accept" header, which became more complex with prometheus/common 0.47.0. See upstream commit a28d786.

This resolved the original FTBFS for which this bug was opened, as far as I could see, which was this test failure:

> === RUN TestFederationWithNativeHistograms
> federate_test.go:417:
> Error Trace: /<>/.build/src/github.com/prometheus/prometheus/web/federate_test.go:417
> Error: Not equal:
> expected: 4
> actual : 1
> Test: TestFederationWithNativeHistograms
> --- FAIL: TestFederationWithNativeHistograms (0.01s)

I was able to reliably reproduce that failure by rolling forward / back the prometheus/common dependency in go.mod on a local git clone.

> In either case, this is still happening for me in the current version:
>
> Lucas Nussbaum wrote:
>> FAILED: 1:48: parse error: unexpected character inside braces: '0'

This sounds like a _new_ bug.

> Note: I'm currently using virtual machines with 1 CPU and with 2 CPUs for archive rebuilds.
> On systems with 2 CPUs, the package FTBFS randomly.
> On systems with 1 CPU, the package FTBFS always.
>
> Therefore, to reproduce, please try GRUB_CMDLINE_LINUX="nr_cpus=1" in /etc/default/grub first.

I'm struggling to see how a different number of CPU cores would elicit the aforementioned new bug. It doesn't seem to have the typical characteristics of a race condition. I'll have to try to find some time to set up a VM and try to reproduce it.
Bug#1064765: prometheus: FTBFS: dh_auto_test error
found 1064765 2.45.3+ds-3
thanks

Daniel Swarbrick wrote:
> * Add new 0022-Support-prometheus-common-0.47.0.patch (Closes: #1064765)

Hello. I don't quite understand how the above fix is related to the bug itself (but maybe it's because I don't know prometheus internals).

In either case, this is still happening for me in the current version:

Lucas Nussbaum wrote:
> FAILED: 1:48: parse error: unexpected character inside braces: '0'

Note: I'm currently using virtual machines with 1 CPU and with 2 CPUs for archive rebuilds.
On systems with 2 CPUs, the package FTBFS randomly.
On systems with 1 CPU, the package FTBFS always.

Therefore, to reproduce, please try GRUB_CMDLINE_LINUX="nr_cpus=1" in /etc/default/grub first.

If that's still not enough to reproduce the build failure, please contact me privately and I will gladly provide a machine to reproduce it.

Thanks.
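For a quick first attempt without rebooting, a rough approximation of the single-CPU case is to run the tests with GOMAXPROCS=1, the setting mentioned elsewhere in this thread. This is a different technique from booting with nr_cpus=1: it limits how many threads the Go runtime uses to run Go code at once, but the kernel still sees all CPUs, so it may not reproduce the failure in the same way. The helper names below (`single_cpu_env`, `run_single_cpu`) are hypothetical, and the command to run is left to the caller.

```python
import os
import subprocess


def single_cpu_env(base_env=None):
    """Return a copy of the environment with GOMAXPROCS pinned to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env["GOMAXPROCS"] = "1"
    return env


def run_single_cpu(command):
    """Run `command` (a list of arguments) with GOMAXPROCS=1.

    Returns the exit status of the command. The command itself, for example
    a package build or a `go test` invocation, is supplied by the caller.
    """
    return subprocess.run(command, env=single_cpu_env()).returncode
```

If the build still passes this way, the nr_cpus=1 setup described above remains the reference reproduction.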
Bug#1064765: prometheus: FTBFS: dh_auto_test error
It appears that bumping prometheus/common to 0.47.0 in the prometheus go.mod will reproduce the test failure.
prometheus/common 0.46.0 and earlier do not provoke the test failure.