Thanks for the positive feedback, Andy!

> I'm wondering if there would be a way of saying "all of these installations
> are for the same 'site'". That would remove a module looking popular simply
> because it is installed a lot, but only by two or three groups. Maybe that
> information is valuable, maybe not...I'm not sure yet.
>

One of the common practices when building a system such as this is keeping
the people who send you data anonymous. That makes filtering by user hard.
I can think of two ways we could deal with that: we could allow users to set
an anonymous=false flag in the JSON blob they deliver, or we could hash the
source IP address and keep that around.
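
To make the hashing option concrete, here is a minimal sketch. Nothing like
this exists in puppet-analytics today; SITE_KEY and site_token are
hypothetical names. It uses a keyed hash rather than a bare one, because the
IPv4 address space is small enough that an unkeyed hash can be reversed by
brute force.

```python
import hashlib
import hmac

# Hypothetical server-side secret; an assumption, not part of puppet-analytics.
SITE_KEY = b"replace-with-a-random-secret"


def site_token(source_ip: str, key: bytes = SITE_KEY) -> str:
    """Return a short keyed hash of the source IP.

    Reports from the same address get the same token, so they can be
    grouped as one 'site' without storing the address itself.
    """
    digest = hmac.new(key, source_ip.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]
```

Two reports from the same address would then share a token, while the raw
address is never kept. Sites behind changing addresses or shared NAT would
still be grouped incorrectly.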

My intent was for users doing CI to report that in the purpose field. That
way we could see total deployments, but also per-usage deployments. I'm not
sure users would be willing to differentiate how they run a script between
production and CI, though, since the goal of CI is to test as close to prod
as you can.
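
As a rough illustration of total versus per-usage counts, assuming each
report is a dict with a "purpose" key (the key name and the
deployments_by_purpose helper are assumptions for this sketch, not the actual
puppet-analytics schema):

```python
from collections import Counter


def deployments_by_purpose(reports):
    """Return (total, per-purpose counts) for a list of report dicts.

    Reports without a purpose are counted under 'unspecified'.
    """
    counts = Counter(r.get("purpose", "unspecified") for r in reports)
    return sum(counts.values()), dict(counts)
```

Someone viewing the data could then look at the "production" count alone, or
at the total, depending on the question being asked.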



> I'm personally a little cautious about making a deploy process depend on
> external services, but this could be fired off as a background job and it
> doesn't really matter too much if it works or not.
>

I agree that it is a big pill to swallow. This will likely change, but
right now every deploy must be reported in its own curl request; there are
no bulk updates. It is also not possible to 'back-fill' data, so deploys are
recorded at the time they are submitted to puppet-analytics. I could see the
day's deploys being written to a file or database on the user's systems, with
a nightly job filling them in on puppet-analytics, but that would require
some changes to the code.
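
The client side of that idea could look something like the sketch below.
This is only an illustration: record_deploy, flush_spool, and the
one-JSON-object-per-line spool file are all assumptions, and the send
callable stands in for whatever actually posts a report. Without the server
changes mentioned above, each flushed deploy would still be recorded at
submission time, not at deploy time.

```python
import json
from pathlib import Path


def record_deploy(spool: Path, report: dict) -> None:
    """Append one deploy report to the local spool file as a JSON line."""
    with spool.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(report) + "\n")


def flush_spool(spool: Path, send) -> int:
    """Pass every spooled report to send(), then remove the spool file.

    Returns the number of reports sent. If send() raises, the spool file
    is left in place, so a retry may resend reports that already went out.
    """
    if not spool.exists():
        return 0
    sent = 0
    for line in spool.read_text(encoding="utf-8").splitlines():
        if line.strip():
            send(json.loads(line))
            sent += 1
    spool.unlink()
    return sent
```

The deploy itself only does a local file append, so it never waits on the
network; a nightly cron job would call flush_spool with a function that
makes the HTTP request.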

I weighed the pros and cons of allowing arbitrary date insertion. I'm happy
to be convinced otherwise, but I think figuring out when a deploy occurred,
when it is reported to a global system by users in different timezones, is
very hard to get right.
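
A small example of the ambiguity, using only the Python standard library
(to_utc is a hypothetical helper for this illustration): the same wall-clock
string means different instants depending on the reporter's UTC offset, and
can even land on a different calendar day.

```python
from datetime import datetime, timedelta, timezone


def to_utc(wall_clock: str, utc_offset_hours: int) -> datetime:
    """Interpret 'YYYY-MM-DD HH:MM' at a fixed UTC offset and convert to UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M").replace(tzinfo=tz)
    return local.astimezone(timezone.utc)


# Same reported string, two different reporters.
west = to_utc("2014-09-08 23:30", -7)  # 2014-09-09 06:30 UTC
east = to_utc("2014-09-08 23:30", 9)   # 2014-09-08 14:30 UTC
```

If a client sent the bare string with no offset, the server would have no
way to tell these two deploys apart, which is why recording the submission
time on the server is the simpler choice for now.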

Thanks again,
Spencer



On Mon, Sep 8, 2014 at 11:21 AM, Andy Parker <[email protected]> wrote:

> On Sun, Sep 7, 2014 at 3:57 PM, Spencer Krum <[email protected]>
> wrote:
>
>> Hi Puppet-dev,
>>
>> I've been working, with a lot of help from some others, on a new project
>> at http://puppet-analytics.org. It is very much in the
>> experimental/development phase and I'm looking for feedback and help.
>>
>> The goal of this project is to enable module authors and users greater
>> visibility into module use. The architecture is modeled after Debian's
>> popularity contest, where a program on the Debian system reports to a
>> central server about package use. This means that Puppet users can
>> submit (through a JSON/HTTP endpoint) 'hey I've deployed this version of
>> stdlib!'. After a bunch of users have been reporting for a while, module
>> maintainers can see the trends, identify which versions of the modules are
>> being used, etc. Similarly users can see which modules are the most
>> popular, which versions of those modules are the most popular, etc.
>>
>>
> This all looks awesome!
>
>
>> There is an arbitrary tagging system built in that allows users to report
>> that the deploy is being performed by their ci infrastructure, by a
>> developer doing testing, or by an operator pushing code to production. This
>> allows people viewing the data to see the 'true' numbers, unpolluted by ci
>> systems or runaway webcrawlers.
>>
>>
> I'm wondering if there would be a way of saying "all of these
> installations are for the same 'site'". That would remove a module looking
> popular simply because it is installed a lot, but only by two or three
> groups. Maybe that information is valuable, maybe not...I'm not sure yet.
>
>
>> Reporting can be done with curl, or with a script. Right now there is a
>> script and example curl to report to puppet analytics at:
>> https://github.com/nibalizer/puppet-analytics-client. I think everyone's
>> infrastructure looks a little different, so writing a generic tool to
>> report to PA would be pretty hard. I'd like puppet-analytics-client to
>> become a place to put scripts and tools to hit PA.
>>
>> I'm interested in your thoughts and opinions. Especially around the opt-in
>> architecture. Would you be willing to report to PA? Do you think we would
>> ever be able to get enough people reporting that the data would be
>> significant? All the code is open source on github (
>> https://github.com/nibalizer/puppet-analytics). The website is hosted on
>> digital ocean. I also have the mental model that people would report after
>> every code change to their Puppet infrastructure, i.e. in the post-commit
>> hook if using dynamic environments. Is this a model you agree with? Do you
>> have a different idea?
>>
>>
> I think that is a great thing to shoot for. I'm personally a little
> cautious about making a deploy process depend on external services, but
> this could be fired off as a background job and it doesn't really matter
> too much if it works or not.
>
>
>> We have had a lot of conversations, on this list, and in person, around
>> 'what are people doing with puppet?' I think a tool like this could really
>> help us figure out which modules are being used the most often.
>>
>>
> Currently I answer this by trawling through a dump of the forge that we
> have available internally. However, my questions often revolve around how
> people are using the language rather than what modules are in use. That
> said, knowing which modules are heavily used would help everyone to
> understand a lot more.
>
>
>> Please note that PA is not nearly done yet. Much of the empty space I
>> expect will be filled in with cool visualizations of the data. It is liable
>> to break at any time, especially with actual users. One of the cool
>> features that is currently in PR is the ability to have shields.io
>> downloads tags come from PA and show up in the READMEs of our modules.
>>
>> Thanks everybody,
>> Spencer
>>
>> --
>> Spencer Krum
>> (619)-980-7820
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Puppet Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/puppet-dev/CADt6FWPoK7N6pwPj4h6_84p-6WEwtz3N6zJbuJniRkHaMi9HBA%40mail.gmail.com
>> <https://groups.google.com/d/msgid/puppet-dev/CADt6FWPoK7N6pwPj4h6_84p-6WEwtz3N6zJbuJniRkHaMi9HBA%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Andrew Parker
> [email protected]
> Freenode: zaphod42
> Twitter: @aparker42
> Software Developer
>
> *Join us at PuppetConf 2014 <http://www.puppetconf.com/>, September
> 22-24 in San Francisco*
> *Register by May 30th to take advantage of the Early Adopter discount
> <http://links.puppetlabs.com/puppetconf-early-adopter> **—**save $349!*
>
>



-- 
Spencer Krum
(619)-980-7820

