[Puppet-dev] Community PR Triage for 2015-06-30

2015-06-30 Thread Josh Cooper
The PR triage for puppet/facter/hiera/puppet-server will be starting at 10:00 
AM PDT today at http://links.puppetlabs.com/pr-triage

Josh

-- 
Josh Cooper
Developer, Puppet Labs

PuppetConf 2015 (http://2015.puppetconf.com/) is coming to Portland, 
Oregon! Join us October 5-9.
Register now to take advantage of the Early Adopter discount and save $349:
https://www.eventbrite.com/e/puppetconf-2015-october-5-9-tickets-13115894995?discount=EarlyAdopter



[Puppet-dev] Re: Catalog Deserialization performance

2015-06-30 Thread Romain F.
I've already benchmarked and profiled Catalog's from_data_hash and 
to_data_hash methods using the benchmark framework.
Most of the time is spent in from_data_hash (we already knew that), but 
there are no big pitfalls where Ruby loses its time.

My callgrind file shows that the top 5 methods (by self time) are:
- Array.flatten (55000 calls)
- Array.each (115089 calls)
- Puppet::Resource.initialize (15000 calls)
- String.=~ (65045 calls)
- Hash[]= (115084 calls)

These top 5 account for ~30% of the total time.

As you can see, this can be difficult to optimize. IMHO, the 
benchmark-tweak-benchmark way of optimizing is not sufficient here. I think 
the way a catalog is (de)serialized needs a deep refactor.
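For anyone who wants to reproduce this kind of profile, a rough sketch with 
the ruby-prof gem looks like the following (the catalog.json path, the way 
the catalog was dumped, and the exact printer options are assumptions on my 
side):

  require 'json'
  require 'ruby-prof'
  require 'puppet'

  # A catalog previously dumped to disk as its plain data hash
  # (the path and the dump mechanism are placeholders).
  data = JSON.parse(File.read('catalog.json'))

  # Profile only the Hash -> Catalog step, which is where the time goes.
  result = RubyProf.profile do
    Puppet::Resource::Catalog.from_data_hash(data)
  end

  # Emit callgrind-compatible output for kcachegrind/qcachegrind;
  # the print signature varies a bit between ruby-prof versions.
  RubyProf::CallTreePrinter.new(result).print(path: Dir.pwd, profile: 'catalog_from_data_hash')

That should give a self-time breakdown comparable to the one above.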

Cheers,

On Tuesday, June 30, 2015 at 04:23:42 UTC+2, henrik lindberg wrote:

 On 2015-29-06 22:41, Trevor Vaughan wrote: 
  If you get a profiling suite together (aka, bunch of random patches) 
  could you release it? 
  

 It is not difficult actually. Look at the benchmarks in the puppet code 
 base. Many of them are suitable for profiling with a ruby profiler. 
 I don't think we have any benchmarks targeting the agent side though, so 
 the first thing to do (for someone) is to write one. 

 What is more difficult is coming up with a benchmark that does not 
 involve real/complex resources - but deserialization and up to actually 
 applying should be possible to work with in a simple way. 

 Profiling is then just running that benchmark with the ruby profiler 
 turned on and analyzing the result, make changes, run again... (repeat 
 until happy). 

 - henrik 


  I've been curious about this for quite some time but never quite got 
  around to dealing with it. 
  
  My concern is very much client-side performance, since the more you 
  manage on a client, the less the client gets to do its actual job. 
  
  Thanks, 
  
  Trevor 
  
  On Mon, Jun 29, 2015 at 4:35 PM, Henrik Lindberg 
  henrik@cloudsmith.com wrote: 
  
  On 2015-29-06 16:48, Romain F. wrote: 
  
  Hi everyone, 
  
  I am trying to optimize our Puppet runs by running some benchmarks and 
  patching the puppet core (if possible), but I have some difficulties 
  around catalog serialization/deserialization. 
  
  In fact, in 3.7.5 or 3.8.x, config retrieval takes roughly 7 secs and 
  only 4 secs of that is on the master side. Same thing in 4.2, but with 
  9 secs of config retrieval and still 4 secs on the master side. 
  
  My first thought was "Okay, time to try MsgPack". No improvements. 
  
  I've instrumented the code in the master branch a bit around this, and 
  I've found out that, of my 9 secs of config retrieval, 3.61 secs is lost 
  in catalog deserialization and 2 secs in the catalog conversion. But it's 
  not the real deserialization (PSON to Hash) that takes ages, it's the 
  creation of the Catalog object itself (Hash to catalog). Benchmarks 
  show that the time to deserialize MsgPack (or PSON) is negligible 
  compared to the catalog deserialization time. 
  
  So here is my question: is that a known issue? Is there any reason for 
  the regression in 4.x (future parser creating more objects, ...)? 
  
  The parser=future setting only makes a difference when compiling the 
  catalog - the catalog itself does not contain more or different data 
  (except possibly using numbers instead of strings for some 
 attributes). 
  
  The best way to optimize this is to write a benchmark using the 
  benchmark framework and measure the time it takes to deserialize a 
  given catalog. Then run that benchmark with Ruby profiling turned 
 on. 
  
  There are quite a few things going on at the agent side in addition 
  to taking the catalog PSON and turning it into a catalog that it can 
  apply (loading types, resolving providers, etc). Make sure to 
  benchmark these separately if possible. 
  
  Regards 
  - henrik 
  
  Cheers, 
  

[Puppet-dev] Re: Catalog Deserialization performance

2015-06-30 Thread Henrik Lindberg

On 2015-30-06 16:17, Romain F. wrote:

I've already benchmarked and profiled Catalog's from_data_hash and
to_data_hash methods using the benchmark framework.
Most of the time is spent in from_data_hash (we already knew that), but
there are no big pitfalls where Ruby loses its time.

My callgrind file shows that the top 5 methods (by self time) are:
- Array.flatten (55000 calls)
- Array.each (115089 calls)
- Puppet::Resource.initialize (15000 calls)
- String.=~ (65045 calls)
- Hash[]= (115084 calls)

These top 5 account for ~30% of the total time.

As you can see, this can be difficult to optimize. IMHO, the
benchmark-tweak-benchmark way of optimizing is not sufficient
here. I think the way a catalog is (de)serialized needs a deep refactor.



There is probably a lot of duplicated work going on at the levels above, 
and that is what causes those generic methods to light up (except 
Puppet::Resource.initialize).


There is both the deserialization process as such to optimize, and also 
the Resource implementation itself, which is far from optimal.


The next thing would be to focus on Resource.initialize/from_data_hash.
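To isolate that, a quick micro-benchmark along these lines could help (a 
sketch on my part; it assumes the resource entries in a dumped catalog.json 
are the hashes Puppet::Resource.from_data_hash expects, and it uses Ruby's 
stdlib Benchmark rather than the puppet benchmark framework):

  require 'json'
  require 'benchmark'
  require 'puppet'

  # Resource hashes pulled out of a previously dumped catalog
  # (the catalog.json path and the 'resources' key are assumptions).
  resources = JSON.parse(File.read('catalog.json'))['resources']

  # Time only the Hash -> Puppet::Resource step, repeated to get a stable number.
  puts Benchmark.measure {
    100.times { resources.each { |r| Puppet::Resource.from_data_hash(r) } }
  }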

I think it is also relevant to establish some kind of "world record" - 
say, serializing and deserializing a hash using MsgPack; a hash of data 
cannot be transported faster across the wire than that (unless you also 
stop using Ruby objects to represent the data, at the cost of a lot of 
extra complexity).


I mean, a hash of some complexity will always consume quite a bit of 
processing and memory to get across the wire. Is the current 
implementation close enough to that world record?
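As a rough way to measure that baseline, something like this would do (a 
sketch; it assumes the msgpack gem and, again, a catalog already dumped to 
catalog.json as plain data):

  require 'json'
  require 'benchmark'
  require 'msgpack'

  # The catalog as a plain Ruby Hash, no Puppet model objects involved.
  data = JSON.parse(File.read('catalog.json'))
  packed = data.to_msgpack

  # Pure pack/unpack round trips: the fastest the data can possibly move.
  Benchmark.bm(8) do |bm|
    bm.report('pack')   { 100.times { data.to_msgpack } }
    bm.report('unpack') { 100.times { MessagePack.unpack(packed) } }
  end

Comparing those numbers against the from_data_hash time shows how far the 
Catalog path is from that record.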


- henrik



[Puppet-dev] Re: Community PR Triage for 2015-06-30

2015-06-30 Thread Josh Cooper
On Tuesday, June 30, 2015 at 8:19:23 AM UTC-7, Josh Cooper wrote:

 The PR triage for puppet/facter/hiera/puppet-server will be starting at 10:00 
 AM PDT today at http://links.puppetlabs.com/pr-triage


Notes from today's PR triage are posted: 
https://github.com/puppet-community/community-triage/blob/master/core/notes/2015-06-30.md.

We merged several PRs (including 4 for native Facter on OpenBSD, thanks 
Jasper!) and had good discussions around filebuckets, the static compiler, and 
agent-side profiling. We'll be back next week at the usual time.

Josh

-- 
Josh Cooper
Developer, Puppet Labs

PuppetConf 2015 (http://2015.puppetconf.com/) is coming to Portland, 
Oregon! Join us October 5-9.
Register now to take advantage of the Early Adopter discount and save $349:
https://www.eventbrite.com/e/puppetconf-2015-october-5-9-tickets-13115894995?discount=EarlyAdopter
