Re: Re: remark about the UDD upload_history table

2024-02-28 Thread Patrice Duroux
Hi Lucas,

My curiosity sometimes pushes me to look at the list of packages that haven't
been uploaded to the archive both in unstable and experimental releases in more
than a certain number of years. And UDD is so useful for that!!

Many thanks,
Patrice



Re: remark about the UDD upload_history table

2024-02-27 Thread Lucas Nussbaum
Hi,

> Le mer. 14 févr. 2024 à 19:36, Patrice Duroux
>  a écrit :
> >
> > Hi,
> >
> > I am a bit surprised with the following output:
> >
> > udd=> select distinct distribution from upload_history;
> >   distribution
> > -

I filed a bug about that, but unfortunately the whole upload_history
gatherer code is fragile, because it is based on parsing the mailing
list archives... So I'm not sure this will be fixed at some point.

On 19/02/24 at 11:15 +0100, Patrice Duroux wrote:
> Hi,
> 
> May I also suggest renaming this 'upload_history.distribution' column
> to 'upload_history.release' for a better consistency with the other
> tables that have the two columns (distribution and release)?

UDD generally follows the naming from the source data. I agree that
there's some confusion between distribution and release that requires
extra care when writing queries, but I don't think that we should change
that at this point given it would be an API change and there might be 
external consumers of the raw (SQL) data.

Lucas



Re: remark about the UDD upload_history table

2024-02-19 Thread Patrice Duroux
Hi,

May I also suggest renaming this 'upload_history.distribution' column
to 'upload_history.release' for a better consistency with the other
tables that have the two columns (distribution and release)?

More generally there are some other places in the UDD source code
where it is confusing, for instances here (extract using rgrep):

web/cgi-bin/arch.cgi:#$query = "select
architecture,distribution,state,count(*) from wannabuild where
distribution ='sid' and state='Installed' group by
architecture,distribution,state order by count;";
web/cgi-bin/arch.cgi:$arch_info->{$arch}->{"longest_build"} =
$rowsc->{"package"}."/".$rowsc->{"distribution"};
web/cgi-bin/arch.cgi:
$arch_info->{$arch}->{"longest_needsbuild"} =
$rowsc->{"package"}."/".$rowsc->{"distribution"};

Here it is with the 'wannabuild' table.

web/cgi-bin/mentors.cgi:pkgs = dbget("select mp.name, mpv.version,
mpv.distribution, pc.insts, mpv.uploaded
web/cgi-bin/mentors.cgi:bugsourceversiondistributionuploadedpopconin
Debianstatuslinks
web/cgi-bin/mentors.cgi:if l['distribution'] == 'unstable'
web/cgi-bin/mentors.cgi:elsif l['distribution'] == 'experimental'
web/cgi-bin/mentors.cgi:  puts "#{l['distribution']}"

And here it is the 'mentors_most_recent_package_versions' view based
on the 'mentors_raw_uploads' table.

Could this be improved or too late?
How? (some SQL ALTER with cgi patches?)

Regards,
Patrice

Le mer. 14 févr. 2024 à 19:36, Patrice Duroux
 a écrit :
>
> Hi,
>
> I am a bit surprised with the following output:
>
> udd=> select distinct distribution from upload_history;
>   distribution
> -
>  experimental
>  froxzen unstable
>  froze unstable
>  frozen
>  frozen  unstable
>  frozen unstable
>  frozen unstable contrib
>  frozen woody
>  frozen-contrib contrib
>  non-free
>  rc-buggy
>  sid
>  sid=20
>  stable
>  stable frozen unstable
>  stable unstable
>  stable-security
>  testing
>  testing unstable
>  testing-security
>  unstable
>  unstable  frozen
>  unstable contrib
>  unstable frozen
>  unstable non-free
>  unstable stable
>  unstable stable frozen
>  unstable testing
>  unstable testing stable
>  unstable unstable
>  unstable=20
>  woody-proposed-updates unstable
> (32 lignes)
>
>
> I am then not sure about the consistency of another SQL query based on
> this column value.
>
> Regards,
> Patrice