Re: [gentoo-dev] paper on oss-qm project

2010-05-08 Thread Sebastian Pipping
On 05/08/10 22:11, Enrico Weigelt wrote:
> what problems do you see w/ licensing ?
> 
> IMHO, each branch simply has to follow the upstream's license.

i have yet to see easy cases with licensing.
i haven't thought about it in detail yet, though, to be honest.


> simply normalize: don't use letters but numbers.

i don't believe in simple normalization before i have seen it.


> b) it's not really a release but just a development snapshot -
>    that doesn't belong in the main oss-qm repository

why doesn't it belong in there?


> I've chosen that scheme to make the borders more clear (also for
> automatic filtering, etc). In my concept, the vendor is the major
> point of distinction, package comes at second, ... 

i guess we agree to disagree then.
i don't think the current scheme promotes cooperation well.


> Well, the term vendor here is defined as a party which provides
> packages in certain variants. "UPSTREAM" is a kind of meta vendor,
> describing the upstreams. "Vendor" is IMHO more generic, since there
> may be vendors who aren't actually a real distro. For example, I 
> myself don't publish a complete distro, but a foundation for clean
> building especially for special embedded devices or appliances.

yes, that's why i proposed "downstream" as a replacement.
you don't consider yourself downstream?


> Yes, that's still an open topic. I've chosen to use one big repo
> for easier maintenance, but I'm aware of the problem that the
> repo might become very fat some day.

my point is not about size, only about "users".


> I see two options:
> 
> a) split it off into several ones, e.g. on a per-package basis
>    and create a system for (semi-)automatic mass-repo maintenance
>    (not completely trivial when using free git hosters as mirrors)

are you aware that splitting it up will reduce the savings in space?
say, if they all had byte-identical GPLv3 COPYING files, that would be
one blob in the single repo atm and N blobs (one per repo) in split mode.
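
The space argument can be sketched with a toy object store. git names a
blob by a hash of its content, so identical files inside one repository
are stored once. The store below is a hypothetical simplification: it
compares contents directly instead of hashing, ignores packing and
compression, and all names in it are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJECTS 16

/* Toy object store (hypothetical): keeps each distinct content once.
 * Real git keys objects by a content hash; comparing the strings
 * directly gives the same deduplication effect for this sketch. */
struct store {
    const char *objects[MAX_OBJECTS];
    size_t count;
};

static void store_add(struct store *s, const char *content)
{
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->objects[i], content) == 0)
            return; /* identical content is already stored */
    assert(s->count < MAX_OBJECTS);
    s->objects[s->count++] = content;
}

/* Objects needed when all n packages live in one repository. */
static size_t objects_shared(const char *const *files, size_t n)
{
    struct store s = { {0}, 0 };
    for (size_t i = 0; i < n; i++)
        store_add(&s, files[i]);
    return s.count;
}

/* Objects needed when every package gets its own repository. */
static size_t objects_split(const char *const *files, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        struct store s = { {0}, 0 };
        store_add(&s, files[i]);
        total += s.count;
    }
    return total;
}
```

For three packages carrying the same COPYING text, objects_shared()
counts one object and objects_split() counts three.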


> b) add a selective filtering system. AFAIK current stable git
>    doesn't provide that yet - I've added a little patch for that:
>
> http://repo.or.cz/w/oss-qm-packages.git/shortlog/refs/heads/METUX.git.master

while i'm not sure about this in detail yet, could it be that this loop
never filters the very first entry? it only ever tests walk->next.

+   while (walk && (walk->next))
+   {
+       if (_filter_remote_ref(transport, walk->next))
+           walk->next = walk->next->next;
+       else
+           walk = walk->next;
+   }
+
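
A minimal sketch of the concern, assuming a singly linked ref list. The
struct, the should_drop() predicate and the branch names are
hypothetical stand-ins, not git's real definitions, and the patch may
well handle the head somewhere outside the quoted hunk. filter_quoted()
mirrors the loop above; filter_all() is one common alternative that
walks a pointer to the link, so the first entry gets tested too.
Neither variant frees the nodes it unlinks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a ref list node; only what the sketch needs. */
struct ref {
    struct ref *next;
    const char *name;
};

/* Hypothetical predicate standing in for _filter_remote_ref():
 * nonzero means "drop this ref". Here: drop anything outside prefix. */
static int should_drop(const struct ref *r, const char *prefix)
{
    return strncmp(r->name, prefix, strlen(prefix)) != 0;
}

/* Same shape as the quoted loop: only walk->next is ever tested,
 * so the head node stays in the list unconditionally. */
static struct ref *filter_quoted(struct ref *head, const char *prefix)
{
    struct ref *walk = head;
    while (walk && walk->next) {
        if (should_drop(walk->next, prefix))
            walk->next = walk->next->next;
        else
            walk = walk->next;
    }
    return head;
}

/* Pointer-to-link variant: every node, including the head, is tested. */
static struct ref *filter_all(struct ref *head, const char *prefix)
{
    struct ref **link = &head;
    while (*link) {
        if (should_drop(*link, prefix))
            *link = (*link)->next;
        else
            link = &(*link)->next;
    }
    return head;
}

static size_t list_length(const struct ref *head)
{
    size_t n = 0;
    for (; head; head = head->next)
        n++;
    return n;
}
```

With the list UPSTREAM.git.master, METUX.zlib.master,
UPSTREAM.zlib.master (each under refs/heads/) and the prefix
"refs/heads/METUX.", filter_quoted() keeps two entries, the untested
head plus the match, while filter_all() keeps only the match.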

best,

sebastian



[gentoo-dev] Package managers and new repositories formats

2010-05-08 Thread Auke Booij
Hey,

As part of Google Summer of Code, together with two other students and
our mentors, I'm looking into supporting non-ebuild repository formats
in existing package managers. Currently, in Portage these are
supported by externally generating ebuilds from available metadata in
advance, and then using these in Portage by putting them in a local
overlay (see g-cpan for an example). In Paludis, there is some
"native" support for non-ebuild repository types (they have exheres,
just for starters), and I've been told it's fairly trivial to add new
repository types in Pkgcore, too, simply by inheriting from a base
class.

My point is of course not to bash on Portage, but to come up with an
elegant solution to support a list of new repository types.
Personally, Google is about to pay me for adding support for R package
repositories (CRAN and Bioconductor), and the two other students will
be doing PyPI and PEAR support, respectively. We aren't looking
forward to doing one task nine times (three repository formats times
three package managers): support for package managers. See, our
projects face two sides: on one side, we have to read and interpret
package repositories, on the other side, we need to communicate with
all the different package managers. Reading repositories is something
we'll only have to implement once for a given repository format, but
passing the relevant data to Portage, Pkgcore or Paludis is something
less trivial.

We, students and mentors, have thought of various plans to only have
to write repository "plugins" and tackle the package manager side
together once and for all, but we couldn't reach agreement. How can we
support the three existing package managers, and any future package
manager which only supports PMS? To accomplish PMS-only package
manager support, the repository code would at least have to be able to
generate ebuilds, but perhaps we could come up with something better
than simply translating one type of metadata into the other, for
slightly more advanced package managers? Could we perhaps come up with
a *standard* for non-ebuild repository type /definitions/? Ideally,
developers would only have to write one chunk of code to read a
repository format, and then all package managers would be able to read
repositories of that type.

Now, before we roll in the discussion of stability: yes, blindly
importing packages from upstream is dangerous, and yes, it may just
set your cat on fire. However, it's a matter of fact that the package
maintainers simply don't have the time to manually check all packages,
and isn't Gentoo all /about/ at least having /access/ to those nuclear
plants? If this monster turns out to be deadly, we can always disable
support by default. That said, I do believe many packages, especially
in CRAN and Bioconductor, eventually will not do much more harm than
cause inflation of the U.S. dollar, which honestly cannot be helped
anyway. Wait, did I say Bioconductor? Oh, then terrorists with access
to Gentoo may have less trouble developing their own biological
weapons, but so be it, as long as they publish their DNA code as
GPL...

The final questions I'm really here for, then, are: how do you feel
about standardizing repository format definitions, how should we
support new repository types in current package managers, how should
we go about constructing a common interface for new format definitions
and why is it always on days like these I run out of coffee?

Thanks for all your thoughts!
Auke Booij / tulcod.



Re: [gentoo-dev] paper on oss-qm project

2010-05-08 Thread Enrico Weigelt
* Sebastian Pipping wrote:

Hi,

> interesting concept.  i'd like to comment on a few details:
> 
>  - licensing does not seem to be addressed yet.
>    licensing can kill everything, it needs consideration.

what problems do you see w/ licensing ?

IMHO, each branch simply has to follow the upstream's license.

>  - branch and tag namespaces as currently defined have a few problems:
> 
>    - versioning:
> 
>      - the A.B.C.D scheme won't be fun for gentoo, both
>        due to no-letters-in-here and because of no-pre-releases.
>        while at that keeping pre-releases does not seem helpful to me.

simply normalize: don't use letters but numbers. and I actually don't
see any need for pre-releases: 

a) it's a real release - then it has to fit into the (linear)
   versioning scheme
b) it's not really a release but just a development snapshot -
   that doesn't belong in the main oss-qm repository
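
To make "numbers instead of letters" concrete, here is one hypothetical
mapping, not taken from the oss-qm paper: a lowercase letter becomes an
extra numeric component (a=1, b=2, ...), and anything else, such as a
"_pre" suffix, is rejected. The function name and the mapping are
assumptions for illustration only. Note that under this rule "1.2.3a"
and a genuine "1.2.3.1" end up identical, so the mapping is lossy.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical normalizer: copies digits and dots, turns a lowercase
 * letter into ".<n>" with a=1, b=2, ... (assumes an ASCII-like charset).
 * Returns 0 on success, -1 on an unsupported character or if the
 * result does not fit into out. */
static int normalize_version(const char *in, char *out, size_t outsz)
{
    size_t used = 0;

    assert(in && out && outsz > 0);
    out[0] = '\0';
    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;
        int n;

        if (isdigit(c) || c == '.')
            n = snprintf(out + used, outsz - used, "%c", c);
        else if (c >= 'a' && c <= 'z')
            n = snprintf(out + used, outsz - used, ".%d", c - 'a' + 1);
        else
            return -1; /* e.g. '_' in "1.0_pre2": no numeric equivalent */
        if (n < 0 || (size_t)n >= outsz - used)
            return -1; /* output would be truncated */
        used += (size_t)n;
    }
    return 0;
}
```

Under these assumptions "1.2.3a" becomes "1.2.3.1" and "0.9.8l" becomes
"0.9.8.12", while "1.0_pre2" is rejected.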
 
>    - vendor concept:
> 
>      - uppercase vendor names look rather odd, especially with project
>        names in lowercase.
> 
>      - having the vendor first makes no sense to me.
>        a "package.vendor.subbranch" keeps all zlibs together,
>        instead of all gentoo stuff.  if the project is about
>        packages, that makes more sense to me.

I've chosen that scheme to make the borders more clear (also for
automatic filtering, etc). In my concept, the vendor is the major
point of distinction, package comes at second, ... 

>      - renaming the concept to "downstream" would make it
>        fit better.  gentoo is not a vendor to me.

Well, the term vendor here is defined as a party which provides
packages in certain variants. "UPSTREAM" is a kind of meta vendor,
describing the upstreams. "Vendor" is IMHO more generic, since there
may be vendors who aren't actually a real distro. For example, I 
myself don't publish a complete distro, but a foundation for clean
building especially for special embedded devices or appliances.

>  - with one git repo used for many packages people
>    will need to know how to clone single branches only.
>    most git users probably won't, you will need to teach them.
>    the PDF seems a good place to do that.

Yes, that's still an open topic. I've chosen to use one big repo
for easier maintenance, but I'm aware of the problem that the
repo might become very fat some day. I see two options:

a) split it off into several ones, e.g. on a per-package basis
   and create a system for (semi-)automatic mass-repo maintenance
   (not completely trivial when using free git hosters as mirrors)

b) add a selective filtering system. AFAIK current stable git
   doesn't provide that yet - I've added a little patch for that:
   http://repo.or.cz/w/oss-qm-packages.git/shortlog/refs/heads/METUX.git.master
   

cu
-- 
-
 Enrico Weigelt==   metux IT service - http://www.metux.de/
-
 Please visit the OpenSource QM Taskforce:
http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
http://patches.metux.de/
-