> On Jun 12, 2020, at 10:30 PM, Arjun Salyan <ar...@macports.org> wrote:
> 
> Hi Craig,
> 
> Thank you. You make a valid point regarding the possible distortion due to 
> weekly submissions being bundled to calculate monthly charts. 
> 
> But what we are seeing here is a known issue with the query that calculates 
> this chart.
> https://github.com/macports/macports-webapp/issues/79 
> 
> The current query has a limitation. Let’s say we receive two submissions from 
> a user within one month. One has port version X.1 and the other has upgraded 
> version X.2, then this query counts that user as using both the versions and 
> not just the latest one. This is the cause for the sudden jump in Mar 2020. 
> This problem is only with the "versions vs month" chart and should be fixed 
> soon. Rest all charts, including "installations by month" display accurate 
> information (https://ports.macports.org/port/gnuplot/stats?days=365).
> 
> Thank you for the percentage suggestion. I am just wondering the right way to 
> display that information graphically.
> 
> I was trying to combine "installations by months" and "versions by month", 
> but it turns out they would be better separate.
> 
> Thank you
> 
> 

The following is a quick mockup of how versions over time might be reported:

https://drive.google.com/file/d/1piEDpd_rq5xnSMAgEsvLO5eu9uu70OpV/view?usp=sharing
 

To display percentages, I don’t think we need the ‘count distinct’. Suppose 
only a single system reports using a particular port, and that port is 
updated during the month. Suppose further that the first two reports in the 
month from that system say it is using version 1.0 and the last two say it 
has version 1.1 installed. Given the way we collect stats, I think it would 
be accurate to report usage as 50% for each version of the port for that 
month, since that is what was reported over the course of the month.

Either that, or use only each system's last report for the month and display 
the date as the last day of that month.
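
As a rough sketch of the two options above, here is how each could be 
computed in Python. The report tuples, the "sys-a" system ID, the dates, and 
the function names are all made up for illustration; this is not the webapp's 
actual schema or query.

```python
from collections import Counter, defaultdict

# Hypothetical report rows: (system_id, report_date "YYYY-MM-DD", port_version).
# One system, four weekly reports, upgraded mid-month.
reports = [
    ("sys-a", "2020-03-02", "1.0"),
    ("sys-a", "2020-03-09", "1.0"),
    ("sys-a", "2020-03-16", "1.1"),
    ("sys-a", "2020-03-23", "1.1"),
]


def to_percentages(counts):
    """Turn {month: Counter(version -> n)} into {month: {version: percent}}."""
    return {
        month: {v: 100.0 * n / sum(c.values()) for v, n in c.items()}
        for month, c in counts.items()
    }


def share_by_report_count(rows):
    """Option 1: every report counts; no 'count distinct' on systems."""
    counts = defaultdict(Counter)
    for _system, date, version in rows:
        counts[date[:7]][version] += 1  # date[:7] is the "YYYY-MM" month key
    return to_percentages(counts)


def share_by_last_report(rows):
    """Option 2: keep only each system's last report within each month."""
    latest = {}
    for system, date, version in rows:
        key = (date[:7], system)
        # ISO dates compare correctly as strings.
        if key not in latest or date > latest[key][0]:
            latest[key] = (date, version)
    counts = defaultdict(Counter)
    for (month, _system), (_date, version) in latest.items():
        counts[month][version] += 1
    return to_percentages(counts)


print(share_by_report_count(reports))  # {'2020-03': {'1.0': 50.0, '1.1': 50.0}}
print(share_by_last_report(reports))   # {'2020-03': {'1.1': 100.0}}
```

With the first option the single system shows up as 50% on each version for 
the month; with the second it shows up only under 1.1.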

Craig
