Re: analyzing popcon data for bogus recommends

2008-05-14 Thread Enrico Zini
On Tue, May 13, 2008 at 10:51:37PM -0400, Joey Hess wrote:

> It would be nice to have a list which Recommends are ignored/overridden
> the most when installing packages, to identify Recommends that need to be
> downgraded to Suggests. Could we derive such a list from popcon data? I
> think it would need to be done by analyzing each individual popcon data
> submission, so I can't do it as that data is not published.

Yes, you can.  Also, there's a Xapian database in my home directory
(~enrico/anapop/something IIRC) on people.debian.org that is built from
the popcon data, and you can query it to quickly get a count of
"submissions having package X AND NOT package Y" and of "submissions
having package X AND package Y".
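
To illustrate the two counts mentioned above, here is a minimal sketch
in plain Python. It is not the Xapian database itself and does not use
the Xapian API; it only mimics the idea with a dictionary that maps each
package to the set of submissions reporting it. The function names, host
ids and package names (pkg-a, pkg-b, pkg-c) are made up for the example.

```python
from collections import defaultdict


def build_index(submissions):
    """Map each package name to the set of submission ids that list it."""
    index = defaultdict(set)
    for sub_id, packages in submissions.items():
        for pkg in packages:
            index[pkg].add(sub_id)
    return index


def count_with_and_without(index, x, y):
    """Return (submissions with X AND Y, submissions with X AND NOT Y)."""
    has_x = index.get(x, set())
    has_y = index.get(y, set())
    return len(has_x & has_y), len(has_x - has_y)


# Hypothetical sample data: submission id -> set of installed packages.
submissions = {
    "host1": {"pkg-a", "pkg-b"},
    "host2": {"pkg-a"},
    "host3": {"pkg-a", "pkg-b", "pkg-c"},
    "host4": {"pkg-c"},
}

index = build_index(submissions)
both, only_x = count_with_and_without(index, "pkg-a", "pkg-b")
print(f"pkg-a AND pkg-b: {both}, pkg-a AND NOT pkg-b: {only_x}")
```

With real data, the set intersection and difference would be replaced by
boolean queries against the actual index.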

That Xapian index treats each popcon submission as a "document" and
each installed package as a "term".

The database is updated by a weekly cronjob that rescans the whole
popcon database.  I briefly tried in the past[1] to come up with ways
to hook the indexing process into popcon so that I could do realtime
indexing of the data (which would give an up-to-date index and would
not suck 100% CPU on gluck once a week), but I got the impression that
it required more discussion than I was motivated to have at the time.
If more people are interested in using that Xapian index, it could make
sense to rehash this.


Ciao,

Enrico

[1] http://lists.alioth.debian.org/pipermail/popcon-developers/2007-June/001374.html
-- 
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <[EMAIL PROTECTED]>




Re: analyzing popcon data for bogus recommends

2008-05-13 Thread Petter Reinholdtsen
[Joey Hess]
> It would be nice to have a list which Recommends are
> ignored/overridden the most when installing packages, to identify
> Recommends that need to be downgraded to Suggests. Could we derive
> such a list from popcon data?

I have no idea if that can be done. :)

> I think it would need to be done by analyzing each individual popcon
> data submission, so I can't do it as that data is not published.

The raw popcon data is available to all Debian Developers at
popcon.debian.org.  I am unable to log in to confirm the exact location
at the moment, but it is there somewhere. :)

Putting [EMAIL PROTECTED] on the CC list, as
it is a better place to discuss the use of popcon data.

Happy hacking,
-- 
Petter Reinholdtsen


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: analyzing popcon data for bogus recommends

2008-05-13 Thread Daniel Burrows
On Tue, May 13, 2008 at 11:09:20PM -0400, Felipe Sateler <[EMAIL PROTECTED]> was heard to say:
> Joey Hess wrote:
> 
> > It would be nice to have a list which Recommends are ignored/overridden
> > the most when installing packages, to identify Recommends that need to be
> > downgraded to Suggests. Could we derive such a list from popcon data? I
> > think it would need to be done by analyzing each individual popcon data
> > submission, so I can't do it as that data is not published.
> 
> I think you need more than popcon data: AFAIK, popcon doesn't say which
> packages were installed manually and which automatically. Maybe package B is
> installed and only recommended by A, but there is no way to tell whether
> package B was needed on its own.

  It's true that you probably couldn't use this to find recommendations
that *should* exist, but if a Recommends is being widely ignored /
overridden (i.e., if the number of systems that have A installed but
not B is high), then it might be worth re-examining that dependency.

  Daniel





Re: analyzing popcon data for bogus recommends

2008-05-13 Thread Felipe Sateler
Joey Hess wrote:

> It would be nice to have a list which Recommends are ignored/overridden
> the most when installing packages, to identify Recommends that need to be
> downgraded to Suggests. Could we derive such a list from popcon data? I
> think it would need to be done by analyzing each individual popcon data
> submission, so I can't do it as that data is not published.

I think you need more than popcon data: AFAIK, popcon doesn't say which
packages were installed manually and which automatically. Maybe package B is
installed and only recommended by A, but there is no way to tell whether
package B was needed on its own.

-- 

  Felipe Sateler

