Thanks for the positive feedback, Andy!

> I'm wondering if there would be a way of saying "all of these installations
> are for the same 'site'". That would remove a module looking popular simply
> because it is installed a lot, but only by two or three groups. Maybe that
> information is valuable, maybe not...I'm not sure yet.
>

A common practice when building a system like this is to keep the people
who send you data anonymous, which makes filtering by user hard. I can
think of two ways to deal with that: we could let users set an
anonymous=false flag in the JSON blob they deliver, or we could hash the
source IP address and keep that around.
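
To make the second option concrete, here is a rough Python sketch. It is only an illustration, not how puppet-analytics works today: the function names, the server-side secret, and the (module, source IP) input shape are all assumptions. It uses a keyed hash (HMAC) rather than a bare hash, since the IPv4 address space is small enough that an unkeyed hash could be reversed by trying every address.

```python
import hashlib
import hmac
from collections import defaultdict


def site_token(source_ip, secret):
    """Return a stable, opaque token for a source IP (hypothetical helper)."""
    return hmac.new(secret, source_ip.encode("utf-8"), hashlib.sha256).hexdigest()


def distinct_sites_per_module(deploys, secret):
    """Count distinct site tokens per module.

    `deploys` is an iterable of (module_name, source_ip) pairs; this input
    shape is an assumption made for the example.
    """
    sites = defaultdict(set)
    for module, source_ip in deploys:
        sites[module].add(site_token(source_ip, secret))
    return {module: len(tokens) for module, tokens in sites.items()}


if __name__ == "__main__":
    secret = b"example-only-secret"  # placeholder; a real secret stays server-side
    deploys = [
        ("puppetlabs-stdlib", "192.0.2.10"),
        ("puppetlabs-stdlib", "192.0.2.10"),
        ("puppetlabs-stdlib", "198.51.100.7"),
    ]
    # Three deploys, but only two distinct source addresses.
    print(distinct_sites_per_module(deploys, secret))
```

Only the token would be stored, so deploys from one address can be grouped without keeping the address itself. Sites behind NAT or with changing addresses would still be miscounted.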

My intent was for users doing CI to report that in the purpose field. That
way we could see total deployments, but also per-usage deployments. I'm not
sure users would be willing to differentiate how they run a script between
production and CI, though, since the goal of CI is to test as close to prod
as you can.



> I'm personally a little cautious about making a deploy process depend on
> external services, but this could be fired off as a background job and it
> doesn't really matter too much if it works or not.
>

I agree that it is a big pill to swallow. This will likely change, but
right now every deploy must be reported in its own curl request; there are
no bulk updates. It is also not possible to 'back-fill' data, so deploys
are recorded as of the time they are submitted to puppet-analytics. I could
see the day's deploys being written to a file or database on the user's
systems, with a nightly job filling them in on puppet-analytics, but that
would require some changes to the code.
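
As a rough sketch of that spool-then-flush idea in Python: the JSON field names, the `deployed_at` timestamp, and the `send` callable below are all hypothetical. Puppet-analytics does not currently accept back-filled timestamps, so this only shows the client-side half.

```python
import json
import os
from datetime import datetime, timezone


def record_deploy(spool_path, module, version, tags):
    """Append one deploy to a local JSON-lines spool file."""
    entry = {
        "module": module,
        "version": version,
        "tags": tags,
        # Captured in UTC at deploy time. Hypothetical field: the server
        # has no way to accept it today.
        "deployed_at": datetime.now(timezone.utc).isoformat(),
    }
    with open(spool_path, "a", encoding="utf-8") as spool:
        spool.write(json.dumps(entry) + "\n")


def flush_spool(spool_path, send):
    """Send each spooled deploy via `send(entry)` and keep the failures.

    `send` is a caller-supplied callable that returns True on success.
    Returns the number of entries sent.
    """
    if not os.path.exists(spool_path):
        return 0
    with open(spool_path, encoding="utf-8") as spool:
        entries = [json.loads(line) for line in spool if line.strip()]
    failed = [entry for entry in entries if not send(entry)]
    # Rewrite the spool with only the entries that still need sending.
    with open(spool_path, "w", encoding="utf-8") as spool:
        for entry in failed:
            spool.write(json.dumps(entry) + "\n")
    return len(entries) - len(failed)
```

The deploy itself would only call `record_deploy`, which touches nothing but the local disk. A nightly cron job would call `flush_spool` with whatever function actually posts to the server, so an outage just leaves entries in the file for the next run.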

I weighed the pros and cons of allowing arbitrary date insertion. I'm happy
to be convinced otherwise, but I think figuring out when a deploy occurred,
for a global system with timezones and all that, is very hard to get right.
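
A small Python example of the kind of ambiguity I mean. The helper name is made up for illustration, and it assumes clients would send timestamps carrying an explicit UTC offset, which nothing enforces today.

```python
from datetime import datetime, timedelta, timezone


def to_utc(reported):
    """Convert a timezone-aware datetime to UTC; reject naive ones."""
    if reported.tzinfo is None or reported.utcoffset() is None:
        raise ValueError("timestamp has no UTC offset; cannot place it in time")
    return reported.astimezone(timezone.utc)


if __name__ == "__main__":
    # A deploy late on Sep 8 at UTC-7 is already Sep 9 in UTC.
    utc_minus_7 = timezone(timedelta(hours=-7))
    deploy = datetime(2014, 9, 8, 23, 30, tzinfo=utc_minus_7)
    print(to_utc(deploy).isoformat())  # 2014-09-09T06:30:00+00:00
```

A timestamp without an offset cannot be placed on a global timeline at all, and even a correct one can land on a different calendar day than the user expects. Per-day counts would depend on which convention the server picks.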

Thanks again,
Spencer



On Mon, Sep 8, 2014 at 11:21 AM, Andy Parker <a...@puppetlabs.com> wrote:

> On Sun, Sep 7, 2014 at 3:57 PM, Spencer Krum <krum.spen...@gmail.com>
> wrote:
>
>> Hi Puppet-dev,
>>
>> I've been working, with a lot of help from some others, on a new project
>> at http://puppet-analytics.org. It is very much in the
>> experimental/development phase and I'm looking for feedback and help.
>>
>> The goal of this project is to enable module authors and users greater
>> visibility into module use. The architecture is modeled after Debian's
>> popularity contest, where a program on the debian system reports to a
>> central server about package use. This means that Puppet users can
>> submit (through a json/http endpoint) 'hey I've deployed this version of
>> stdlib!'. After a bunch of users have been reporting for a while, module
>> maintainers can see the trends, identify which versions of the modules are
>> being used, etc. Similarly users can see which modules are the most
>> popular, which versions of those modules are the most popular, etc.
>>
>>
> This all looks awesome!
>
>
>> There is an arbitrary tagging system built in that allows users to report
>> that the deploy is being performed by their ci infrastructure, by a
>> developer doing testing, or by an operator pushing code to production. This
>> allows people viewing the data to see the 'true' numbers, unpolluted by ci
>> systems or runaway webcrawlers.
>>
>>
> I'm wondering if there would be a way of saying "all of these
> installations are for the same 'site'". That would remove a module looking
> popular simply because it is installed a lot, but only by two or three
> groups. Maybe that information is valuable, maybe not...I'm not sure yet.
>
>
>> Reporting can be done with curl, or with a script. Right now there is a
>> script and example curl to report to puppet analytics at:
>> https://github.com/nibalizer/puppet-analytics-client. I think everyone's
>> infrastructure looks a little different, so writing a generic tool to
>> report to PA would be pretty hard. I'd like puppet-analytics-client to
>> become a place to put scripts and tools to hit PA.
>>
>> I'm interested in your thoughts and opinions. Especially around the opt-in
>> architecture. Would you be willing to report to PA? Do you think we would
>> ever be able to get enough people reporting that the data would be
>> significant? All the code is open source on github (
>> https://github.com/nibalizer/puppet-analytics). The website is hosted on
>> digital ocean. I also have the mental model that people would report after
>> every code change to their Puppet infrastructure, i.e. in the post-commit
>> hook if using dynamic environments. Is this a model you agree with? Do you
>> have a different idea?
>>
>>
> I think that is a great thing to shoot for. I'm personally a little
> cautious about making a deploy process depend on external services, but
> this could be fired off as a background job and it doesn't really matter
> too much if it works or not.
>
>
>> We have had a lot of conversations, on this list, and in person, around
>> 'what are people doing with puppet?' I think a tool like this could really
>> help us figure out which modules are being used the most often.
>>
>>
> Currently I answer this by trawling through a dump of the forge that we
> have available internally. However, my questions often revolve around how
> people are using the language rather than what modules are in use. That
> said, knowing which modules are heavily used would help everyone to
> understand a lot more.
>
>
>> Please note that PA is not nearly done yet. Much of the empty space I
>> expect will be filled in with cool visualizations of the data. It is liable
>> to break at any time, especially with actual users. One of the cool
>> features that is currently in PR is the ability to have shields.io
>> download tags come from PA and show up in the READMEs of our modules.
>>
>> Thanks everybody,
>> Spencer
>>
>> --
>> Spencer Krum
>> (619)-980-7820
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Puppet Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to puppet-dev+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/puppet-dev/CADt6FWPoK7N6pwPj4h6_84p-6WEwtz3N6zJbuJniRkHaMi9HBA%40mail.gmail.com
>> <https://groups.google.com/d/msgid/puppet-dev/CADt6FWPoK7N6pwPj4h6_84p-6WEwtz3N6zJbuJniRkHaMi9HBA%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Andrew Parker
> a...@puppetlabs.com
> Freenode: zaphod42
> Twitter: @aparker42
> Software Developer
>
> *Join us at PuppetConf 2014 <http://www.puppetconf.com/>, September
> 22-24 in San Francisco*
> *Register by May 30th to take advantage of the Early Adopter discount
> <http://links.puppetlabs.com/puppetconf-early-adopter> **—**save $349!*
>



-- 
Spencer Krum
(619)-980-7820
