Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte

2019-03-12 Thread Kienan Stewart
On Tue, Mar 12, 2019 at 10:22:52AM +0200, Apollon Oikonomopoulos wrote:
> Control: tags -1 - moreinfo + patch pending
> Control: severity -1 serious
> 
> OK, thanks for confirming this and thanks for the detailed report, it's 
> very helpful!
> 
> I'm bumping this bug to RC, as I don't think Puppet report storage 
> should break merely by installing an unrelated Ruby package.  
> Additionally this is a regression from Puppet 4.x.
> 
> So, here's what's happening:
> 
>  - Puppet 5 switched from a custom wire format (PSON) to JSON for 
>transmitting facts and reports.
>  - There is no issue when using the ruby-json JSON library to parse 
>reports.
>  - Some JSON libraries (Oj and JrJackson) de-serialize floats with many 
>decimal digits - such as the timing metrics found in a Puppet agent 
>report - as BigDecimal.
>  - When BigDecimal's are serialized again to JSON, they are serialized 
>as Strings (and not floats), causing PuppetDB's schema validation to 
>fail.
>  - This path is only triggered when ruby-oj and ruby-multi-json are 
>installed, enabling Puppet to use Oj via multi-json. (JrJackson is 
>Jruby-only, and there are provisions upstream handling BigDecimal 
>conversion in this case).
> 
> Your patch fixes the issue, but it does so right before the report is 
> transmitted to PuppetDB. I think it's best to instruct Oj to never 
> deserialize floats as BigDecimals and avoid having to do any conversions 
> in the first place. Additionally, this will guard all other report 
> processors (e.g. store) which might want to handle metrics against 
> similar issues.
> 
> Can you test the attached patch and confirm that it works?
>

Hi,

I applied the patch (reloaded apache2 afterwards) and runs are able to store 
reports now.
(I did confirm before the patch was applied that the reports were not being 
stored).

Thanks for the great explanation, your effort and time, and the patch!

Thanks,
Kienan

> Regards,
> Apollon


signature.asc
Description: PGP signature


Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte

2019-03-12 Thread Apollon Oikonomopoulos
Control: tags -1 - moreinfo + patch pending
Control: severity -1 serious

On 13:43 Mon 11 Mar , Kienan Stewart wrote:
> Hi Apollon,
> 
> On Sat, Mar 09, 2019 at 11:24:27PM +0200, Apollon Oikonomopoulos wrote:
> > Control: tags -1 - unreproducible
> > 
> > Hi,
> > 
> > 
> > Thanks for the patch, it should do the trick. However, before applying 
> > it I want to be 100% sure that we know what's happening and why.
> > 
> > So, I managed to reproduce this by simply installing ruby-multi-json,
> > which allows Puppet to use different json backends. Can you verify that 
> > ruby-multi-json is installed, presumably on the master?
> > 
> 
> ruby-multi-json (1.12.1-1) is installed.
> 
> Checking with 'apt-cache policy rdepends ruby-multi-json', it's like because 
> we use r10k

OK, thanks for confirming this and thanks for the detailed report, it's 
very helpful!

I'm bumping this bug to RC, as I don't think Puppet report storage 
should break merely by installing an unrelated Ruby package.  
Additionally this is a regression from Puppet 4.x.

So, here's what's happening:

 - Puppet 5 switched from a custom wire format (PSON) to JSON for 
   transmitting facts and reports.
 - There is no issue when using the ruby-json JSON library to parse 
   reports.
 - Some JSON libraries (Oj and JrJackson) de-serialize floats with many 
   decimal digits - such as the timing metrics found in a Puppet agent 
   report - as BigDecimal.
 - When BigDecimal's are serialized again to JSON, they are serialized 
   as Strings (and not floats), causing PuppetDB's schema validation to 
   fail.
 - This path is only triggered when ruby-oj and ruby-multi-json are 
   installed, enabling Puppet to use Oj via multi-json. (JrJackson is 
   Jruby-only, and there are provisions upstream handling BigDecimal 
   conversion in this case).

Your patch fixes the issue, but it does so right before the report is 
transmitted to PuppetDB. I think it's best to instruct Oj to never 
deserialize floats as BigDecimals and avoid having to do any conversions 
in the first place. Additionally, this will guard all other report 
processors (e.g. store) which might want to handle metrics against 
similar issues.

Can you test the attached patch and confirm that it works?

Regards,
Apollon
From: Apollon Oikonomopoulos 
Date: Sun, 10 Mar 2019 01:22:29 +0200
Subject: Avoid BigDecimals when loading JSON using Oj

This is already done for JrJackson and needs to be done for Oj as well
to avoid sending malformed reports to PuppetDB.

Bug-Debian: https://bugs.debian.org/923976
---
 lib/puppet/util/json.rb | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/puppet/util/json.rb b/lib/puppet/util/json.rb
index 6baea59..3475990 100644
--- a/lib/puppet/util/json.rb
+++ b/lib/puppet/util/json.rb
@@ -32,13 +32,15 @@ module Puppet::Util
 def self.load(string, options = {})
   if defined? MultiJson
 begin
-  # This ensures that JrJackson will parse very large or very small
+  # This ensures that JrJackson and Oj will parse very large or very small
   # numbers as floats rather than BigDecimals, which are serialized as
   # strings by the built-in JSON gem and therefore can cause schema errors,
   # for example, when we are rendering reports to JSON using `to_pson` in
   # PuppetDB.
   if MultiJson.adapter.name == "MultiJson::Adapters::JrJackson"
 options[:use_bigdecimal] = false
+  elsif MultiJson.adapter.name == "MultiJson::Adapters::Oj"
+options[:bigdecimal_load] = :float
   end
 
   MultiJson.load(string, options)


Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte

2019-03-11 Thread Kienan Stewart
Hi Apollon,

On Sat, Mar 09, 2019 at 11:24:27PM +0200, Apollon Oikonomopoulos wrote:
> Control: tags -1 - unreproducible
> 
> Hi,
> 
> 
> Thanks for the patch, it should do the trick. However, before applying 
> it I want to be 100% sure that we know what's happening and why.
> 
> So, I managed to reproduce this by simply installing ruby-multi-json,
> which allows Puppet to use different json backends. Can you verify that 
> ruby-multi-json is installed, presumably on the master?
> 

ruby-multi-json (1.12.1-1) is installed.

Checking with 'apt-cache policy rdepends ruby-multi-json', it's like because we 
use r10k

> Thanks,
> Apollon

Thanks,
Kienan


signature.asc
Description: PGP signature


Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte

2019-03-09 Thread Apollon Oikonomopoulos
Control: tags -1 - unreproducible

Hi,

On 12:26 Fri 08 Mar , Kienan Stewart wrote:
> Hi,
> 
> I have made a patch which seems to resolve the issue. I'm not at all confident
> it's in the right place. The patch applies to the PuppetDB source which is 
> used
> to make the puppet-terminus-puppetdb package.
> 
> diff --git a/puppet/lib/puppet/reports/puppetdb.rb 
> b/puppet/lib/puppet/reports/puppetdb.rb
> index d3f0c07..b78f02c 100644
> --- a/puppet/lib/puppet/reports/puppetdb.rb
> +++ b/puppet/lib/puppet/reports/puppetdb.rb
> @@ -129,7 +129,7 @@ Puppet::Reports.register_report(:puppetdb) do
>  [:puppetdb, :metrics_list, :build]) do
>metrics_list = []
>metrics.each do |name, data|
> -metric_hashes = data.values.map {|x| {"category" => data.name, 
> "name" => x.first, "value" => x.last}}
> +metric_hashes = data.values.map {|x| {"category" => data.name, 
> "name" => x.first, "value" => x.last.to_f}}
>  metrics_list.concat(metric_hashes)
>end
>metrics_list
> 
> I'm not aware if there will be consequences to forcing all the metrics values 
> to
> floats.

Thanks for the patch, it should do the trick. However, before applying 
it I want to be 100% sure that we know what's happening and why.

So, I managed to reproduce this by simply installing ruby-multi-json,
which allows Puppet to use different json backends. Can you verify that 
ruby-multi-json is installed, presumably on the master?

Thanks,
Apollon



Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte

2019-03-08 Thread Kienan Stewart
Hi,

I have made a patch which seems to resolve the issue. I'm not at all confident
it's in the right place. The patch applies to the PuppetDB source which is used
to make the puppet-terminus-puppetdb package.

diff --git a/puppet/lib/puppet/reports/puppetdb.rb 
b/puppet/lib/puppet/reports/puppetdb.rb
index d3f0c07..b78f02c 100644
--- a/puppet/lib/puppet/reports/puppetdb.rb
+++ b/puppet/lib/puppet/reports/puppetdb.rb
@@ -129,7 +129,7 @@ Puppet::Reports.register_report(:puppetdb) do
 [:puppetdb, :metrics_list, :build]) do
   metrics_list = []
   metrics.each do |name, data|
-metric_hashes = data.values.map {|x| {"category" => data.name, "name" 
=> x.first, "value" => x.last}}
+metric_hashes = data.values.map {|x| {"category" => data.name, "name" 
=> x.first, "value" => x.last.to_f}}
 metrics_list.concat(metric_hashes)
   end
   metrics_list

I'm not aware if there will be consequences to forcing all the metrics values to
floats.

Thanks,
Kienan


signature.asc
Description: PGP signature


Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte

2019-03-08 Thread Kienan Stewart
Hi,

I've been digging around a bit more and I think I have an idea to help narrow
the conditions in which I'm seeing this error.

I re-configured a test box to output the report to disk and to PuppetDB so I
was able to compare the data sent.

I see a correlation between time metrics stored a Ruby.BigNumber and all the
failed metrics when trying store the report in PuppetDB. Since the metrics
values with differ from run to run, it could be possible to have runs which
are able to be stored without this specific error.

A summary comparison table is available in the attachment 
reports-errors-summary.tsv.

To produce this I took the error line from the PuppetDB log 
(report-puppetdb_error.log)
where traceback lists details of "value does not match schema"

:cause "Value does not match schema: {:metrics [nil nil nil nil nil nil nil 
nil nil nil nil nil {:value (not (instance? java.lang.Number 
a-java.lang.String))} nil {:value (not (instance? java.lang.Number 
a-java.lang.String))} {:value (not (instance? java.lang.Number 
a-java.lang.String))} nil nil nil {:value (not (instance? java.lang.Number 
a-java.lang.String))} nil {:value (not (instance? java.lang.Number 
a-java.lang.String))} nil nil {:value (not (instance? java.lang.Number 
a-java.lang.String))} nil {:value (not (instance? java.lang.Number 
a-java.lang.String))} nil {:value (not (instance? java.lang.Number 
a-java.lang.String))} {:value (not (instance? java.lang.Number 
a-java.lang.String))} nil nil nil nil {:value (not (instance? java.lang.Number 
a-java.lang.String))} nil nil nil nil {:value (not (instance? java.lang.Number 
a-java.lang.String))} nil nil nil nil nil]}"


Each nil value I assumed to be not an error. I lined these values up in 
appearance
order with the metrics data from the same log. (I won't paste the entire line 
here
since it's quite long).

I then got the corresponding value from the report stored on disk (attached
as report-stored.yaml).

Note: for these reports I was using our puppetmaster manifest, so there are more
resource metrics than on an example with an empty manifest.

Thanks,
Kienan

resourceschanged1nil
resourcescorrective_change1nil 
resourcesfailed0nil 
resourcesfailed_to_restart0nil 
resourcesout_of_sync1nil 
resourcesrestarted0nil 
resourcesscheduled0nil 
resourcesskipped0nil 
resourcestotal715nil 
timealternatives0.023357004nil 
timeanchor8.95E-04nil 
timeaugeas0.089786623nil 
timecatalog_application4.96E+00{:value (not (instance? java.lang.Number a-java.lang.String))}ruby/object:BigDecimal27:0.49594088389994795e1
timeconcat_file0.001357939nil 
timeconcat_fragment5.54E-03{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.553882705e2
timeconfig_retrieval7.15E+00{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.7151159435435e1
timeconvert_catalog0.544536797000546nil 
timecron0.014934991nil 
timeexec0.020149716nil 
timefact_generation2.20E+00{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.21955279620005967e1
timefile1.173581222nil 
timefile_line2.55E-02{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.255076198e1
timefilebucket1.11E-04nil 
timegroup7.41E-04nil 
timeini_setting1.27E-02{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.126739428e1
timemailalias5.89E-04nil 
timenode_retrieval6.56E+00{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.6560879654185e1
timepackage0.009044969nil 
timeplugin_sync1.49E+00{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.1485196431767e1
timepostgresql_conf3.16E-04{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.315886003e3
timepostgresql_conn_validator0.054996048nil 
timepostgresql_psql1.220987548nil 
timepuppetdb_conn_validator0.033792702nil 
timeresources6.83E-05nil 
timeservice2.94E-01{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal27:0.2937224029997e0
timessh_authorized_key0.0013181nil 
timesshkey8.07E-04nil 
timetotal22.909333186nil 
timetransaction_evaluation4.70985980499972nil 
timeuser3.47E-02{:value (not (instance? java.lang.Number a-java.lang.String))} ruby/object:BigDecimal36:0.346835945e1
timevcsrepo0.055521215nil 
changestotal1nil 
eventsfailure0nil 
eventssuccess1nil 
eventstotal1nil

2019-03-08T11:03:59.479-05:00 INFO  [p.p.command] [16-1552061039324] [86 ms] 
'replace facts' command processed for pm-buster.test
2019-03-08T11:04:05.158-05:00 INFO  [p.p.command] [17-1552061044601] [178 ms] 
'replace catalog' command processed for pm-buster.test
2019-03-08T11:04:12.519-05:00 ERROR [p.p.command] [18] [store report] Fatal 
error on attempt 0 for pm-buster.test
clojure.lang.ExceptionInfo: throw+: {:fatal true, :cause #error {
 :cause "Value does not 

Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte

2019-03-08 Thread Kienan Stewart
On Fri, Mar 08, 2019 at 04:34:39PM +0200, Apollon Oikonomopoulos wrote:
> Control: tags -1 unreproducible moreinfo
> 
> 
> Turns out this is already enabled and running in our tests without 
> issues. I'm unable to trivially reproduce this, which makes me wonder if 
> it's a matter of the terminus version. Which version of the 
> puppet-terminus-puppetdb package are you using on the master?

puppet-terminus-puppetdb is currently 6.2.0-3

> 
> Regards,
> Apollon


signature.asc
Description: PGP signature


Bug#923976: [Pkg-puppet-devel] Bug#923976: Bug#923976: puppet: Reports submitte

2019-03-08 Thread Apollon Oikonomopoulos
Control: tags -1 unreproducible moreinfo

On 15:47 Fri 08 Mar , Apollon Oikonomopoulos wrote:
> On 14:54 Thu 07 Mar , Kienan wrote:
> > When running puppet 5.x with reports enabled and sent to PuppetDB 6.x,
> > PuppetDB fails to store the report with the following error:
> > 
> > 2019-03-01T17:06:01.396-05:00 ERROR [p.p.command] [100] [store report] 
> > Fatal error on attempt 0 for pm-buster.test
> > clojure.lang.ExceptionInfo: throw+: {:fatal true, :cause #error {
> >  :cause "Value does not match schema: {:metrics [nil nil nil nil nil nil
> >  nil nil nil nil nil nil {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} nil {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} nil {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} nil {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} nil nil {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} nil {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} nil {:value (not (instance? java.lang.Number
> >  a-java.lang.String))} nil nil nil nil nil nil nil nil nil {:value (not
> >  (instance? java.lang.Number a-java.lang.String))} nil nil nil nil nil
> >  nil]}"
> >  
> > A copy of the Puppet DB log file is attached with the complete
> > backtrace.
> > 
> > I'm filing this bug against the puppet package for the following reasons
> > 
> > * hosts running puppet 4.x (stretch) are able to submit reports to same
> > version of Puppet DB.
> 
> So, this sounds like a regression. I would not be surprised if Puppet 4 
> nodes failed to work with PuppetDB 6, but this is clearly a bug.
> 
> > * I am able to "solve" the problem by modifying a number of places in
> > the puppet agent files where report information is returned and calling
> > "to_i" or "to_f" to force casting into a numeric value instead of a
> > string.
> > 
> > The metrics keys affected include: convert_catalog and
> > transaction_evaluation. The keys affected may change depending on the
> > manifest used for the node.
> > 
> > This being said, it may be appropriate to move this bug to the PuppetDB
> > package. I haven't been able to find any upstream bugs regarding this
> > behaviour.
> > 
> > To reproduce:
> > 
> > 1. Install puppet, puppet-master-passenger, and puppetdb on the same
> > host.
> > 2. Configure the puppet master to storeconfigs with the PuppetDB and the
> > agent to submit reports:
> > 
> > (snippet from /etc/puppet.conf)
> > 
> >   [main]
> > reports = puppetdb
> 
> Note to self: let's add report storage to the puppet/puppetdb 
> autopkgtests.

Turns out this is already enabled and running in our tests without 
issues. I'm unable to trivially reproduce this, which makes me wonder if 
it's a matter of the terminus version. Which version of the 
puppet-terminus-puppetdb package are you using on the master?

Regards,
Apollon