Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte
On Tue, Mar 12, 2019 at 10:22:52AM +0200, Apollon Oikonomopoulos wrote: > Control: tags -1 - moreinfo + patch pending > Control: severity -1 serious > > OK, thanks for confirming this and thanks for the detailed report, it's > very helpful! > > I'm bumping this bug to RC, as I don't think Puppet report storage > should break merely by installing an unrelated Ruby package. > Additionally this is a regression from Puppet 4.x. > > So, here's what's happening: > > - Puppet 5 switched from a custom wire format (PSON) to JSON for >transmitting facts and reports. > - There is no issue when using the ruby-json JSON library to parse >reports. > - Some JSON libraries (Oj and JrJackson) de-serialize floats with many >decimal digits - such as the timing metrics found in a Puppet agent >report - as BigDecimal. > - When BigDecimal's are serialized again to JSON, they are serialized >as Strings (and not floats), causing PuppetDB's schema validation to >fail. > - This path is only triggered when ruby-oj and ruby-multi-json are >installed, enabling Puppet to use Oj via multi-json. (JrJackson is >Jruby-only, and there are provisions upstream handling BigDecimal >conversion in this case). > > Your patch fixes the issue, but it does so right before the report is > transmitted to PuppetDB. I think it's best to instruct Oj to never > deserialize floats as BigDecimals and avoid having to do any conversions > in the first place. Additionally, this will guard all other report > processors (e.g. store) which might want to handle metrics against > similar issues. > > Can you test the attached patch and confirm that it works? > Hi, I applied the patch (reloaded apache2 afterwards) and runs are able to store reports now. (I did confirm before the patch was applied that the reports were not being stored). Thanks for the great explanation, your effort and time, and the patch! Thanks, Kienan > Regards, > Apollon signature.asc Description: PGP signature
Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte
Control: tags -1 - moreinfo + patch pending Control: severity -1 serious On 13:43 Mon 11 Mar , Kienan Stewart wrote: > Hi Apollon, > > On Sat, Mar 09, 2019 at 11:24:27PM +0200, Apollon Oikonomopoulos wrote: > > Control: tags -1 - unreproducible > > > > Hi, > > > > > > Thanks for the patch, it should do the trick. However, before applying > > it I want to be 100% sure that we know what's happening and why. > > > > So, I managed to reproduce this by simply installing ruby-multi-json, > > which allows Puppet to use different json backends. Can you verify that > > ruby-multi-json is installed, presumably on the master? > > > > ruby-multi-json (1.12.1-1) is installed. > > Checking with 'apt-cache policy rdepends ruby-multi-json', it's like because > we use r10k OK, thanks for confirming this and thanks for the detailed report, it's very helpful! I'm bumping this bug to RC, as I don't think Puppet report storage should break merely by installing an unrelated Ruby package. Additionally this is a regression from Puppet 4.x. So, here's what's happening: - Puppet 5 switched from a custom wire format (PSON) to JSON for transmitting facts and reports. - There is no issue when using the ruby-json JSON library to parse reports. - Some JSON libraries (Oj and JrJackson) de-serialize floats with many decimal digits - such as the timing metrics found in a Puppet agent report - as BigDecimal. - When BigDecimal's are serialized again to JSON, they are serialized as Strings (and not floats), causing PuppetDB's schema validation to fail. - This path is only triggered when ruby-oj and ruby-multi-json are installed, enabling Puppet to use Oj via multi-json. (JrJackson is Jruby-only, and there are provisions upstream handling BigDecimal conversion in this case). Your patch fixes the issue, but it does so right before the report is transmitted to PuppetDB. I think it's best to instruct Oj to never deserialize floats as BigDecimals and avoid having to do any conversions in the first place. Additionally, this will guard all other report processors (e.g. store) which might want to handle metrics against similar issues. Can you test the attached patch and confirm that it works? Regards, Apollon From: Apollon Oikonomopoulos Date: Sun, 10 Mar 2019 01:22:29 +0200 Subject: Avoid BigDecimals when loading JSON using Oj This is already done for JrJackson and needs to be done for Oj as well to avoid sending malformed reports to PuppetDB. Bug-Debian: https://bugs.debian.org/923976 --- lib/puppet/util/json.rb | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/lib/puppet/util/json.rb b/lib/puppet/util/json.rb index 6baea59..3475990 100644 --- a/lib/puppet/util/json.rb +++ b/lib/puppet/util/json.rb @@ -32,13 +32,15 @@ module Puppet::Util def self.load(string, options = {}) if defined? MultiJson begin - # This ensures that JrJackson will parse very large or very small + # This ensures that JrJackson and Oj will parse very large or very small # numbers as floats rather than BigDecimals, which are serialized as # strings by the built-in JSON gem and therefore can cause schema errors, # for example, when we are rendering reports to JSON using `to_pson` in # PuppetDB. if MultiJson.adapter.name == "MultiJson::Adapters::JrJackson" options[:use_bigdecimal] = false + elsif MultiJson.adapter.name == "MultiJson::Adapters::Oj" +options[:bigdecimal_load] = :float end MultiJson.load(string, options)
Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte
Hi Apollon, On Sat, Mar 09, 2019 at 11:24:27PM +0200, Apollon Oikonomopoulos wrote: > Control: tags -1 - unreproducible > > Hi, > > > Thanks for the patch, it should do the trick. However, before applying > it I want to be 100% sure that we know what's happening and why. > > So, I managed to reproduce this by simply installing ruby-multi-json, > which allows Puppet to use different json backends. Can you verify that > ruby-multi-json is installed, presumably on the master? > ruby-multi-json (1.12.1-1) is installed. Checking with 'apt-cache policy rdepends ruby-multi-json', it's like because we use r10k > Thanks, > Apollon Thanks, Kienan signature.asc Description: PGP signature
Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte
Control: tags -1 - unreproducible Hi, On 12:26 Fri 08 Mar , Kienan Stewart wrote: > Hi, > > I have made a patch which seems to resolve the issue. I'm not at all confident > it's in the right place. The patch applies to the PuppetDB source which is > used > to make the puppet-terminus-puppetdb package. > > diff --git a/puppet/lib/puppet/reports/puppetdb.rb > b/puppet/lib/puppet/reports/puppetdb.rb > index d3f0c07..b78f02c 100644 > --- a/puppet/lib/puppet/reports/puppetdb.rb > +++ b/puppet/lib/puppet/reports/puppetdb.rb > @@ -129,7 +129,7 @@ Puppet::Reports.register_report(:puppetdb) do > [:puppetdb, :metrics_list, :build]) do >metrics_list = [] >metrics.each do |name, data| > -metric_hashes = data.values.map {|x| {"category" => data.name, > "name" => x.first, "value" => x.last}} > +metric_hashes = data.values.map {|x| {"category" => data.name, > "name" => x.first, "value" => x.last.to_f}} > metrics_list.concat(metric_hashes) >end >metrics_list > > I'm not aware if there will be consequences to forcing all the metrics values > to > floats. Thanks for the patch, it should do the trick. However, before applying it I want to be 100% sure that we know what's happening and why. So, I managed to reproduce this by simply installing ruby-multi-json, which allows Puppet to use different json backends. Can you verify that ruby-multi-json is installed, presumably on the master? Thanks, Apollon
Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte
Hi, I have made a patch which seems to resolve the issue. I'm not at all confident it's in the right place. The patch applies to the PuppetDB source which is used to make the puppet-terminus-puppetdb package. diff --git a/puppet/lib/puppet/reports/puppetdb.rb b/puppet/lib/puppet/reports/puppetdb.rb index d3f0c07..b78f02c 100644 --- a/puppet/lib/puppet/reports/puppetdb.rb +++ b/puppet/lib/puppet/reports/puppetdb.rb @@ -129,7 +129,7 @@ Puppet::Reports.register_report(:puppetdb) do [:puppetdb, :metrics_list, :build]) do metrics_list = [] metrics.each do |name, data| -metric_hashes = data.values.map {|x| {"category" => data.name, "name" => x.first, "value" => x.last}} +metric_hashes = data.values.map {|x| {"category" => data.name, "name" => x.first, "value" => x.last.to_f}} metrics_list.concat(metric_hashes) end metrics_list I'm not aware if there will be consequences to forcing all the metrics values to floats. Thanks, Kienan signature.asc Description: PGP signature
Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte
Hi, I've been digging around a bit more and I think I have an idea to help narrow the conditions in which I'm seeing this error. I re-configured a test box to output the report to disk and to PuppetDB so I was able to compare the data sent. I see a correlation between time metrics stored a Ruby.BigNumber and all the failed metrics when trying store the report in PuppetDB. Since the metrics values with differ from run to run, it could be possible to have runs which are able to be stored without this specific error. A summary comparison table is available in the attachment reports-errors-summary.tsv. To produce this I took the error line from the PuppetDB log (report-puppetdb_error.log) where traceback lists details of "value does not match schema" :cause "Value does not match schema: {:metrics [nil nil nil nil nil nil nil nil nil nil nil nil {:value (not (instance? java.lang.Number a-java.lang.String))} nil {:value (not (instance? java.lang.Number a-java.lang.String))} {:value (not (instance? java.lang.Number a-java.lang.String))} nil nil nil {:value (not (instance? java.lang.Number a-java.lang.String))} nil {:value (not (instance? java.lang.Number a-java.lang.String))} nil nil {:value (not (instance? java.lang.Number a-java.lang.String))} nil {:value (not (instance? java.lang.Number a-java.lang.String))} nil {:value (not (instance? java.lang.Number a-java.lang.String))} {:value (not (instance? java.lang.Number a-java.lang.String))} nil nil nil nil {:value (not (instance? java.lang.Number a-java.lang.String))} nil nil nil nil {:value (not (instance? java.lang.Number a-java.lang.String))} nil nil nil nil nil]}" Each nil value I assumed to be not an error. I lined these values up in appearance order with the metrics data from the same log. (I won't paste the entire line here since it's quite long). I then got the corresponding value from the report stored on disk (attached as report-stored.yaml). Note: for these reports I was using our puppetmaster manifest, so there are more resource metrics than on an example with an empty manifest. Thanks, Kienan resourceschanged1nil resourcescorrective_change1nil resourcesfailed0nil resourcesfailed_to_restart0nil resourcesout_of_sync1nil resourcesrestarted0nil resourcesscheduled0nil resourcesskipped0nil resourcestotal715nil timealternatives0.023357004nil timeanchor8.95E-04nil timeaugeas0.089786623nil timecatalog_application4.96E+00{:value (not (instance? java.lang.Number a-java.lang.String))}ruby/object:BigDecimal27:0.49594088389994795e1 timeconcat_file0.001357939nil timeconcat_fragment5.54E-03{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.553882705e2 timeconfig_retrieval7.15E+00{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.7151159435435e1 timeconvert_catalog0.544536797000546nil timecron0.014934991nil timeexec0.020149716nil timefact_generation2.20E+00{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.21955279620005967e1 timefile1.173581222nil timefile_line2.55E-02{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.255076198e1 timefilebucket1.11E-04nil timegroup7.41E-04nil timeini_setting1.27E-02{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.126739428e1 timemailalias5.89E-04nil timenode_retrieval6.56E+00{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.6560879654185e1 timepackage0.009044969nil timeplugin_sync1.49E+00{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.1485196431767e1 timepostgresql_conf3.16E-04{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.315886003e3 timepostgresql_conn_validator0.054996048nil timepostgresql_psql1.220987548nil timepuppetdb_conn_validator0.033792702nil timeresources6.83E-05nil timeservice2.94E-01{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.2937224029997e0 timessh_authorized_key0.0013181nil timesshkey8.07E-04nil timetotal22.909333186nil timetransaction_evaluation4.70985980499972nil timeuser3.47E-02{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.346835945e1 timevcsrepo0.055521215nil changestotal1nil eventsfailure0nil eventssuccess1nil eventstotal1nil 2019-03-08T11:03:59.479-05:00 INFO [p.p.command] [16-1552061039324] [86 ms] 'replace facts' command processed for pm-buster.test 2019-03-08T11:04:05.158-05:00 INFO [p.p.command] [17-1552061044601] [178 ms] 'replace catalog' command processed for pm-buster.test 2019-03-08T11:04:12.519-05:00 ERROR [p.p.command] [18] [store report] Fatal error on attempt 0 for pm-buster.test clojure.lang.ExceptionInfo: throw+: {:fatal true, :cause #error { :cause "Value does not
Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte
On Fri, Mar 08, 2019 at 04:34:39PM +0200, Apollon Oikonomopoulos wrote: > Control: tags -1 unreproducible moreinfo > > > Turns out this is already enabled and running in our tests without > issues. I'm unable to trivially reproduce this, which makes me wonder if > it's a matter of the terminus version. Which version of the > puppet-terminus-puppetdb package are you using on the master? puppet-terminus-puppetdb is currently 6.2.0-3 > > Regards, > Apollon signature.asc Description: PGP signature
Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte
Control: tags -1 unreproducible moreinfo On 15:47 Fri 08 Mar , Apollon Oikonomopoulos wrote: > On 14:54 Thu 07 Mar , Kienan wrote: > > When running puppet 5.x with reports enabled and sent to PuppetDB 6.x, > > PuppetDB fails to store the report with the following error: > > > > 2019-03-01T17:06:01.396-05:00 ERROR [p.p.command] [100] [store report] > > Fatal error on attempt 0 for pm-buster.test > > clojure.lang.ExceptionInfo: throw+: {:fatal true, :cause #error { > > :cause "Value does not match schema: {:metrics [nil nil nil nil nil nil > > nil nil nil nil nil nil {:value (not (instance? java.lang.Number > > a-java.lang.String))} {:value (not (instance? java.lang.Number > > a-java.lang.String))} nil {:value (not (instance? java.lang.Number > > a-java.lang.String))} nil {:value (not (instance? java.lang.Number > > a-java.lang.String))} nil {:value (not (instance? java.lang.Number > > a-java.lang.String))} {:value (not (instance? java.lang.Number > > a-java.lang.String))} {:value (not (instance? java.lang.Number > > a-java.lang.String))} nil nil {:value (not (instance? java.lang.Number > > a-java.lang.String))} nil {:value (not (instance? java.lang.Number > > a-java.lang.String))} nil {:value (not (instance? java.lang.Number > > a-java.lang.String))} nil nil nil nil nil nil nil nil nil {:value (not > > (instance? java.lang.Number a-java.lang.String))} nil nil nil nil nil > > nil]}" > > > > A copy of the Puppet DB log file is attached with the complete > > backtrace. > > > > I'm filing this bug against the puppet package for the following reasons > > > > * hosts running puppet 4.x (stretch) are able to submit reports to same > > version of Puppet DB. > > So, this sounds like a regression. I would not be surprised if Puppet 4 > nodes failed to work with PuppetDB 6, but this is clearly a bug. > > > * I am able to "solve" the problem by modifying a number of places in > > the puppet agent files where report information is returned and calling > > "to_i" or "to_f" to force casting into a numeric value instead of a > > string. > > > > The metrics keys affected include: convert_catalog and > > transaction_evaluation. The keys affected may change depending on the > > manifest used for the node. > > > > This being said, it may be appropriate to move this bug to the PuppetDB > > package. I haven't been able to find any upstream bugs regarding this > > behaviour. > > > > To reproduce: > > > > 1. Install puppet, puppet-master-passenger, and puppetdb on the same > > host. > > 2. Configure the puppet master to storeconfigs with the PuppetDB and the > > agent to submit reports: > > > > (snippet from /etc/puppet.conf) > > > > [main] > > reports = puppetdb > > Note to self: let's add report storage to the puppet/puppetdb > autopkgtests. Turns out this is already enabled and running in our tests without issues. I'm unable to trivially reproduce this, which makes me wonder if it's a matter of the terminus version. Which version of the puppet-terminus-puppetdb package are you using on the master? Regards, Apollon