Re: [gentoo-dev] Re: Tags (Was: RFC: split up media-sound/ category)

2011-06-25 Thread Kent Fredric
On 25 June 2011 00:51, Nathan Phillip Brink bi...@gentoo.org wrote:
 On Wed, Jun 22, 2011 at 11:20:40PM -0400, Wyatt Epp wrote:
 I bring this up because there are several packages with the same name
 and different qualification.  Obviously, they'll have different tags
 because they're not the same thing, but neither should they share the
 same directory.  So the simple solution is to just change the package
 names so we avoid collision and preserve our flat ontology (I've
 forgotten the objection to doing this; please forgive).

 I believe that the objection is that it is better to follow upstreams'
 package names as directly as possible. This would look better and be
 less confusing than having a package named git and git-core, like I've
 seen elsewhere. Having categories would also prevent changing an
 ebuild's name from upstream's name only for the sake of giving it a
 unique name in Gentoo.

You could possibly abuse the tag feature to solve this issue in
interesting ways.

Both could have:

<tag class="common-name">git</tag>

and thus people looking for 'git' would be more likely to find it.

You could also perhaps extend this idea to other forms of metadata
that users are likely to want to be able to search, perhaps:

<tag class="provides-binary">git</tag>

But that specific usage would probably be deemed slightly abusive. (
as for instance, you may wish to also list what the binary is usually
called, and what gentoo ships it as to prevent collisions )


 I think that in most cases, when package name collisions happen, the
 colliding packages differ enough that they'd conceivably be in
 different portage categories, letting them be uniquely identified in
 Gentoo. If someone is planning on writing a new program, he likely
 knows about already-existing alternatives to this package. The author
 of a new sound editing suite would not name his suite `sox' because
 the author cannot help but to know that media-sound/sox exists. But
 someone writing some new sax thing might play off of `sax' and name it
 `sox', though this is hypothetical ;-).


But with this, you could store one as media-sound/sox-translator and
the other as media-sound/sox-saxophone  or something equally unique
but arbitrary and tag them both with

<tag class="common-name">sox</tag>

Also, with this in mind, it may be better to have some types of tags
that are only aggregated in an index of sorts, and others which are
also perhaps made available as a tree of symlinks.
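
A minimal sketch of the "aggregated in an index" idea, in Python. Everything here is an assumption for illustration: the `PACKAGES` data, the package names, and the `build_index` helper are hypothetical and not part of any existing Gentoo tool.

```python
from collections import defaultdict

# Hypothetical data: package -> list of (tag class, tag value) pairs.
PACKAGES = {
    "media-sound/sox-translator": [("common-name", "sox")],
    "media-sound/sox-saxophone": [("common-name", "sox")],
    "dev-vcs/git": [("common-name", "git"), ("provides-binary", "git")],
}

def build_index(packages):
    """Map each (class, value) pair to the sorted packages carrying it."""
    index = defaultdict(list)
    for name, tags in packages.items():
        for tag_class, value in tags:
            index[(tag_class, value)].append(name)
    return {key: sorted(names) for key, names in index.items()}

INDEX = build_index(PACKAGES)
# Looking up the common name finds both renamed packages.
print(INDEX[("common-name", "sox")])
```

Keying the index on the (class, value) pair keeps a search for the common name `git` separate from a search for a binary called `git`.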

 --
 binki

 Look out for missing or extraneous apostrophes!




-- 
Kent

perl -e  print substr( \edrgmaM  SPA NOcomil.ic\\@tfrken\, \$_ * 3,
3 ) for ( 9,8,0,7,1,6,5,4,3,2 );

http://kent-fredric.fox.geek.nz



Re: [gentoo-dev] Tags (Was: RFC: split up media-sound/ category)

2011-06-25 Thread Kent Fredric
On 25 June 2011 00:57, Nathan Phillip Brink bi...@gentoo.org wrote:
 On Wed, Jun 22, 2011 at 08:57:47PM -0400, Wyatt Epp wrote:

 <cat>
 <tag>media</tag>
 <tag>video</tag>
 <tag>kde</tag>
 <tag>editors</tag>
 </cat>


I'm strongly of the mind that by making the tag system arbitrarily
flat, you might be prematurely limiting yourself, as well as risking a
future where the tag index is a sea of meaningless words.

Tags in my mind, should be grouped by the sort of information they are
trying to convey, as opposed to being arbitrary and completely
un-grouped.

The present category system only has one namespace, which is more or
less what-you-use-it-for, and if your tag system is likewise going
to take that vector as the only approach, you will ultimately end up
duplicating the category system, albeit without the present limitation
that means one package can only exist in one place.

This need not be the case, we can suggest alternative tag namespaces,
such as : The sorts of files it supports working with, the sorts of
things it can read, the sorts of things it can write.

At present, things that migrate one type of media to another, such as
pdf -> image, image -> pdf, image -> video, video -> images, etc.
have to be forced to a sort of useless categorisation system.

However, if via tag data, we were able to annotate a) what can be
written and b) what can be read, this system could be leveraged to
epic proportions of win.

   tag-lookup --supporting $( file ./foo );
   
   Read/Write:
     foobarnator - Blah blah blah
   Read:
     foo-bar     - Blah blah
     foo-bjaz    - Blah blah blah
   Write:
     a2foo       - Blah Blah


   tag-lookup --verbose --supporting $( file ./foo );
   
   Read/Write:
     foobarnator - Blah blah blah
       - reads x, y, z, foo
       - writes a, b, c, foo
   Read:
     foo-bar     - Blah blah
       - reads foo
       - writes text
     foo-bjaz    - Blah blah blah
       - reads foo, bar
       - writes text, mp3
   Write:
     a2foo       - Blah Blah
       - reads mp3, png, jpeg
       - writes foo
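
The grouping in the mock output above could be computed roughly like this. It is a sketch in Python under stated assumptions: `tag-lookup` does not exist, and the `FORMATS` table and `supporting` function are hypothetical names that simply reuse the placeholder packages from the example.

```python
# Hypothetical per-package annotations of readable and writable formats.
FORMATS = {
    "foobarnator": {"reads": {"x", "y", "z", "foo"}, "writes": {"a", "b", "c", "foo"}},
    "foo-bar": {"reads": {"foo"}, "writes": {"text"}},
    "foo-bjaz": {"reads": {"foo", "bar"}, "writes": {"text", "mp3"}},
    "a2foo": {"reads": {"mp3", "png", "jpeg"}, "writes": {"foo"}},
}

def supporting(fmt, formats=FORMATS):
    """Group packages by whether they read, write, or read and write fmt."""
    groups = {"Read/Write": [], "Read": [], "Write": []}
    for name in sorted(formats):
        can_read = fmt in formats[name]["reads"]
        can_write = fmt in formats[name]["writes"]
        if can_read and can_write:
            groups["Read/Write"].append(name)
        elif can_read:
            groups["Read"].append(name)
        elif can_write:
            groups["Write"].append(name)
    return groups

for heading, names in supporting("foo").items():
    print(heading + ":", ", ".join(names))
```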


As a side note, it may be beneficial to tag a package version
specifically for some of the above mentioned features. Especially if
you wish to support my provides-binary suggestion, because the
shipped binary may change from one version/slot to another.

I'm not sure if there's a way to provide data on a per-version level
yet in Metadata.xml, but I am assuming there's not as I don't see it
documented.

<pkgmetadata>
 <versionspecific>
  <slot>2</slot>
  <pkgmetadata>
   ... normal stuff
  </pkgmetadata>
 </versionspecific>
 <versionspecific>
  <maxversion>1.0</maxversion>
  <maxversion>1.999</maxversion>
  <pkgmetadata>
   ... normal stuff
  </pkgmetadata>
 </versionspecific>
</pkgmetadata>


Or something similar.
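
To show how a consumer might read such per-slot data, here is a small Python sketch. The `versionspecific` and `slot` elements are the hypothetical schema from this message, not something metadata.xml supports; `DOC`, the `foo`/`foo2` binary names, and `tags_for_slot` are likewise made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document using the proposed (non-existent) schema.
DOC = """
<pkgmetadata>
  <versionspecific>
    <slot>2</slot>
    <pkgmetadata><tag class="provides-binary">foo2</tag></pkgmetadata>
  </versionspecific>
  <versionspecific>
    <slot>1</slot>
    <pkgmetadata><tag class="provides-binary">foo</tag></pkgmetadata>
  </versionspecific>
</pkgmetadata>
"""

def tags_for_slot(xml_text, slot):
    """Return (class, value) pairs from the block matching the given slot."""
    root = ET.fromstring(xml_text)
    result = []
    for block in root.findall("versionspecific"):
        if block.findtext("slot") == slot:
            for tag in block.findall("pkgmetadata/tag"):
                result.append((tag.get("class"), tag.text))
    return result

print(tags_for_slot(DOC, "2"))
```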



-- 
Kent

perl -e  print substr( \edrgmaM  SPA NOcomil.ic\\@tfrken\, \$_ * 3,
3 ) for ( 9,8,0,7,1,6,5,4,3,2 );

http://kent-fredric.fox.geek.nz



Re: [gentoo-dev] Please migrate to git-2.eclass

2011-06-25 Thread justin
On 6/24/11 11:35 PM, Michał Górny wrote:
 Hello,
 
 git-2.eclass is in the tree for a while now, and there's still awful
 lot of packages using old & deprecated git.eclass.
 
 Why migrate?

Hi,

What are the pitfalls during migration? Or is it just /git/git-2/ ?

Thanks, justin



signature.asc
Description: OpenPGP digital signature


Re: [gentoo-dev] validity of manifest signing key

2011-06-25 Thread justin
Hi,

I have been signing my commits since I became a dev, but I just discovered
that I only do sha1 signing. How do I switch to sha256 signing?


justin



signature.asc
Description: OpenPGP digital signature


Re: [gentoo-dev] Please migrate to git-2.eclass

2011-06-25 Thread Michał Górny
On Sat, 25 Jun 2011 09:11:16 +0200
justin j...@gentoo.org wrote:

 On 6/24/11 11:35 PM, Michał Górny wrote:
  Hello,
  
  git-2.eclass is in the tree for a while now, and there's still awful
  lot of packages using old & deprecated git.eclass.
  
  Why migrate?
 
 What are the pitfalls during migration? Or is it just /git/git-2/ ?

In the most common case, yes. But you are advised to add a http
fallback in EGIT_REPO_URI.

Other API changes are:
- EGIT_*_CMD is no longer overridable,
- src_prepare() is no longer exported (EGIT_BOOTSTRAP is called
  in src_unpack() but I don't recommend that, EGIT_PATCHES gone),
- EGIT_QUIET gone, EGIT_UNPACK_DIR gone (changes to other vars prolly).

So I think most of these won't apply to the ebuilds, maybe the second
one.

-- 
Best regards,
Michał Górny


signature.asc
Description: PGP signature


Re: [gentoo-dev] validity of manifest signing key

2011-06-25 Thread Michał Górny
On Sat, 25 Jun 2011 09:37:55 +0200
justin j...@gentoo.org wrote:

 I was signing my commits since I am a dev, but I just discovered that
 I only do sha1 signing. How do I switch to sha256 signing?

$ grep digest ~/.gnupg/gpg.conf 
personal-digest-preferences sha256,sha512,sha1,ripemd160,md5

-- 
Best regards,
Michał Górny


signature.asc
Description: PGP signature


[gentoo-dev] Re: Please migrate to git-2.eclass

2011-06-25 Thread Nikos Chantziaras

On 06/25/2011 12:35 AM, Michał Górny wrote:

Hello,

git-2.eclass is in the tree for a while now, and there's still awful
lot of packages using old & deprecated git.eclass.


I think I remember seeing deprecation warnings in the past when an 
ebuild was using a deprecated eclass (right at the beginning when the 
emerge starts.)  Perhaps it would be a good idea to add one of those in 
git.eclass.





Re: [gentoo-dev] Tags (Was: RFC: split up media-sound/ category)

2011-06-25 Thread Wyatt Epp
On Sat, Jun 25, 2011 at 02:49, Kent Fredric kentfred...@gmail.com wrote:
 I'm strongly of the mind that by making the tag system arbitrarily
 flat, you might be prematurely limiting yourself, as well as risking a
 future where the tag index is a sea of meaningless words.

 Tags in my mind, should be grouped by the sort of information they are
 trying to convey, as opposed to being arbitrary and completely
 un-grouped.

 The present category system only has one namespace, which is more or
 less what-you-use-it-for, and if your tag system is likewise going
 to take that vector as the only approach, you will ultimately end up
 duplicating the category system, albeit without the present limitation
 that means one package can only exist in one place.

 This need not be the case, we can suggest alternative tag namespaces,
 such as : The sorts of files it supports working with, the sorts of
 things it can read, the sorts of things it can write.

 At present, things that migrate one type of media to another, such as
 pdf -> image, image -> pdf, image -> video, video -> images, etc.
 have to be forced to a sort of useless categorisation system.

 However, if via tag data, we were able to annotate a) what can be
 written and b) what can be read, this system could be leveraged to
 epic proportions of win.

Okay, apologies in advance for my long-windedness.  I hope this all
makes sense to everyone.

I should probably clarify that clinging strictly to flatness is not
what I'm proposing.  Reality has borne out the need for implications
and aliases in sanitising an unruly dataset with a complex
user-generated index, while arbitrary democratised group building has
improved some aspects of discovery.  However, I would consider these
features to be a lower priority than having a system at all.

So to break it down:
Tags - a concise vocabulary used for search.  In their default state
they are untyped and non-hierarchical.  They identify traits of a
package.  Suggest using lower-case and simple, descriptive naming
conventions. Highest priority.
Example: alien {{converter nogui package_management reads_tgz
reads_rpm reads_pkg reads_slp reads_lsb writes_tgz writes_rpm
writes_pkg writes_slp writes_lsb}}

Alias - a relationship between two tags establishing equivalence.
Query of the left term returns results of the right.  This type of
relationship helps reduce dictionary clutter. Low priority.
Example: sound = audio.  Attempting to add sound to a package will
instead add audio and searches for sound will return the results for
audio.

Implication - a relationship between two tags where the presence of
the left term necessarily requires the right.  This relationship
reduces menial work.  Low priority.
Example: mpd -> audio.  Adding mpd to the package will also add audio.
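
The alias and implication rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: `ALIASES`, `IMPLICATIONS`, and `normalise` are hypothetical names, and the two rules are the examples given in this message.

```python
# Alias: the left term is replaced by the right term.
ALIASES = {"sound": "audio"}
# Implication: the left term also adds every term on the right.
IMPLICATIONS = {"mpd": {"audio"}}

def normalise(tags, aliases=ALIASES, implications=IMPLICATIONS):
    """Resolve aliases, then expand implications until nothing new appears."""
    resolved = set()
    pending = [aliases.get(tag, tag) for tag in tags]
    while pending:
        tag = pending.pop()
        if tag in resolved:
            continue
        resolved.add(tag)
        # Implied tags may themselves be aliases or imply further tags.
        pending.extend(aliases.get(t, t) for t in implications.get(tag, ()))
    return resolved

print(sorted(normalise(["sound", "mpd"])))
```

Running the same function over a search query means a search for `sound` is answered with the results for `audio`.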

Kent, your idea is pretty interesting and I rather like it.
Fortunately, it's completely possible within the context of the basic
flat layout, as I outlined with Alien above.  It probably looks ugly
to you-- this is no illusion; it's pretty ugly.  But it also grants us
the flexibility to get a basic system in place quickly and without a
lot of hassle.  We get 90% of the benefit up front, and can extend it
as necessary.

Unfortunately for real hierarchical methods, people still have
difficulty with even simple metadata systems.  Fetch some MP3s off the
internet and check their tags or look at search engine queries and
you'll find an entire class of people hampered by what is currently a
largely alien art.  In the end, this system needs to be usable by
people and by keeping it primarily flat, we ease the conceptual
overhead of its implementation and its use.  If it can't be
implemented on itch-scratching timescales, we have failed.  If people
can't use it with very little learning curve, we have failed.

A word on vocabulary:
As you've no doubt noticed, there seems to be a degree of combinatoric
explosion of tags in the method I propose.  In practical use, it's not
as bad as it looks.  For Gentoo, I'd recommend a basic canonical
list of general tags based on the current category system (subject to
discussion and addition/subtraction) and incorporate suggestions like
Kent's as they come up.  It's okay to control the vocabulary.  What
you find is that after the initial implementation, it grows fairly
slowly. (Even with reads_* and writes_* the number will probably be
south of 500 tags for a long time; the current categories dissolve
into about 175 tags from what I can see.)

Regards,
Wyatt



Re: [gentoo-dev] RFC: split up media-sound/ category

2011-06-25 Thread Maciej Mrozowski
On Tuesday 21 of June 2011 16:24:07 Michał Górny wrote:
 Hello,
 
 As we discussed for a while, the media-sound/ category has grown very
 large and it may be a good idea to split it.
 
 Right now, it contains audio players, editing software, converters,
 sound systems and a lot of other utilities related to sound. Splitting
 that up would make looking up software easier for users (e.g. if I want
 to take a look at what audio players we have, I don't need to see all
 other programs).
 
 What do you think? What new category/-ies do you suggest?

I'd suggest we start thinking about dropping the broken categories concept and
introducing (not necessarily as an instant replacement) a flat but tagged
package list (so that real vector search on tags can be used).

-- 
regards
MM


signature.asc
Description: This is a digitally signed message part.


Re: [gentoo-dev] RFC: split up media-sound/ category

2011-06-25 Thread Maciej Mrozowski
On Friday 24 of June 2011 09:55:19 Ciaran McCreesh wrote:
 On Fri, 24 Jun 2011 09:52:03 +0200
 
 Jesús J. Guerrero Botella jesus.guerrero.bote...@gmail.com wrote:
  You might not like it, but Gentoo categories have always been
  directories, not words into metadata.xml.
 
 So tags are in some way related to categories then?

IMHO the best approach is to forget about categories and:

- make package names unique identifiers (it's not that hard, renaming stuff in 
app-xemacs mostly) - categories would serve no purpose as id anymore (though 
may need to be provided as backward compatibility - but with symlinks to 
ebuilds/${PN} inside)

- move such packages into ${PORTDIR}/ebuilds directory (so that identity is 
ensured on filesystem level) - 'ebuilds' name doesn't seem to be reserved 
anywhere so good candidate imho.
To those concerned with directory lookup speed (in order to find package by 
name) - generated package index file provided in ${PORTDIR}

- extend their metadata.xml (no ebuild variables please) with tags in accepted 
format. We should provide dictionary for available tags - necessary in order 
to avoid randomly added system tags - tag could be extended when needed - 
similar policy to global USE flags for instance

- package manager needs to be make aware of tags of course in order to allow 
package list (not tree anymore) searching and filtering
(virtual package tree can be generated from tag - by number of tag occurrences 
in packages - for instance all packages with tag kde could be shown 
somewhere within kde tag subtree etc)

- no tag related symlinks please
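
As a rough illustration of the "virtual package tree generated from tags" point above, the following Python sketch orders tags by how many packages carry them. The `PACKAGE_TAGS` sample and the `virtual_tree` helper are assumptions made up for this example, not an existing package manager feature.

```python
from collections import Counter

# Hypothetical flat package list with tags.
PACKAGE_TAGS = {
    "amarok": {"kde", "audio", "player"},
    "k3b": {"kde", "burning"},
    "mpd": {"audio", "player"},
}

def virtual_tree(package_tags):
    """Build tag -> packages, most frequently used tags first."""
    counts = Counter(tag for tags in package_tags.values() for tag in tags)
    tree = {}
    # Sort by descending occurrence count, then alphabetically for ties.
    for tag in sorted(counts, key=lambda t: (-counts[t], t)):
        tree[tag] = sorted(p for p, tags in package_tags.items() if tag in tags)
    return tree

for tag, packages in virtual_tree(PACKAGE_TAGS).items():
    print(tag, packages)
```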

-- 
regards
MM


signature.asc
Description: This is a digitally signed message part.


Re: [gentoo-dev] RFC: split up media-sound/ category

2011-06-25 Thread Kent Fredric
On 25 June 2011 23:55, Maciej Mrozowski reave...@gmail.com wrote:

 IMHO the best approach is to forget about categories and:

 - make package names unique identifiers (it's not that hard, renaming stuff in
 app-xemacs mostly) - categories would serve no purpose as id anymore (though
 may need to be provided as backward compatibility - but with symlinks to
 ebuilds/${PN} inside)


I think something else that may be important to consider if one is
eliminating category directories is how we'll replace the utility
currently provided by category/metadata.xml

Some things are simply grossly impractical to maintain individual
metadata.xml for reliably due to volume ( ie: dev-perl/* , last I
looked, the metadata.xml in there presently is largely copy-pasted
between dists )

Perhaps we need a new way to apply metadata to a whole host of packages?

Also, categories have extra use for simple convenience of their native
groupings, ie: I've been known to set USE flags/KEYWORDS that apply to
an entire category.  Trying to make useflags apply to all packages
with a given tagset would be comparatively messy.

categories also make it easy to do Naïve iteration of packages
efficiently, ie: for the most part, if you want to iterate all
perl-modules, you just need to iterate dev-perl and perl-core , and
that is all, you're not bogged down by stepping into all the other
categories, loading all their files and working out whether or not
they're perl related. ( Yes, I am aware this has its own caveats, but
if you know of these caveats and they're acceptable to your task, then
it's fine )

the 'virtuals' category also is a bundle of fun. I really do not want
to see virtuals identified only by whatever their unique-identifier
might be and the tag 'virtual'. Yuck.




-- 
Kent

perl -e  print substr( \edrgmaM  SPA NOcomil.ic\\@tfrken\, \$_ * 3,
3 ) for ( 9,8,0,7,1,6,5,4,3,2 );

http://kent-fredric.fox.geek.nz



Re: [gentoo-dev] RFC: split up media-sound/ category

2011-06-25 Thread Michał Górny
On Sat, 25 Jun 2011 13:55:40 +0200
Maciej Mrozowski reave...@gmail.com wrote:

 On Friday 24 of June 2011 09:55:19 Ciaran McCreesh wrote:
  On Fri, 24 Jun 2011 09:52:03 +0200
  
  Jesús J. Guerrero Botella jesus.guerrero.bote...@gmail.com wrote:
   You might not like it, but Gentoo categories have always been
   directories, not words into metadata.xml.
  
  So tags are in some way related to categories then?
 
 IMHO the best approach is to forget about categories and:
 
 - make package names unique identifiers (it's not that hard, renaming
 stuff in app-xemacs mostly) - categories would serve no purpose as id
 anymore (though may need to be provided as backward compatibility -
 but with symlinks to ebuilds/${PN} inside)

And we'll all be doing a lot of ugly ${PN/python-/} and so on. Simplify
things, not make harder just because of some wannabe tag fashion.

-- 
Best regards,
Michał Górny


signature.asc
Description: PGP signature


[gentoo-dev] SHA256 and indention in metadata.xml

2011-06-25 Thread justin
Hi all,

So I solved my signing question. With a 1024-bit DSA key you need to add

enable-dsa2

personal-digest-preferences SHA256

to your gpg.conf.

Another question, do we have a rule, how the metadata.xml has to be
indented? Tabs or n spaces?


thanks justin





signature.asc
Description: OpenPGP digital signature


Re: [gentoo-dev] RFC: split up media-sound/ category

2011-06-25 Thread Michał Górny
On Sun, 26 Jun 2011 00:22:39 +1200
Kent Fredric kentfred...@gmail.com wrote:

 Perhaps we need a new way to apply metadata to a whole host of
 packages?
 
 Also, categories have extra use for simple convenience of their native
 groupings, ie: I've been known to set USE flags/KEYWORDS that apply to
 an entire category.  Trying to make useflags apply to all packages
 with a given tagset would be comparatively messy.

Yep, we dropped old-style virtuals and now we want a new mess.

-- 
Best regards,
Michał Górny


signature.asc
Description: PGP signature


Re: [gentoo-dev] Re: Please migrate to git-2.eclass

2011-06-25 Thread Nirbheek Chauhan
On Sat, Jun 25, 2011 at 1:26 PM, Nikos Chantziaras rea...@arcor.de wrote:
 On 06/25/2011 12:35 AM, Michał Górny wrote:

 Hello,

 git-2.eclass is in the tree for a while now, and there's still awful
 lot of packages using old & deprecated git.eclass.

 I think I remember seeing deprecation warnings in the past when an ebuild
 was using a deprecated eclass (right at the beginning when the emerge
 starts.)  Perhaps it would be a good idea to add one of those in git.eclass.


That's a horribly bad idea. Users should never need to see such
things. The example you're thinking of is python.eclass, and that
resulted in confused users filing bug reports.

There's currently a repoman warning for git.eclass usage, and that suffices.

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team



Re: [gentoo-dev] SHA256 and indention in metadata.xml

2011-06-25 Thread Nirbheek Chauhan
On Sat, Jun 25, 2011 at 6:16 PM, justin j...@gentoo.org wrote:
 Another question, do we have a rule, how the metadata.xml has to be
 indented? Tabs or n spaces?


There's no rule, but we should follow the same rule as ebuilds —
indentation should be with a tab that's displayed as 4 spaces in
editors (no expansion of tabs to spaces).

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team



Re: [gentoo-dev] RFC: split up media-sound/ category

2011-06-25 Thread Wyatt Epp
On Sat, Jun 25, 2011 at 08:22, Kent Fredric kentfred...@gmail.com wrote:
 I think something else that may be important to consider if one is
 eliminating category directories is how we'll replace the utility
 currently provided by category/metadata.xml

 Some things are simply grossly impractical to maintain individual
 metadata.xml for reliably due to volume ( ie: dev-perl/* , last I
 looked, the metadata.xml in there presently is largely copy-pasted
 between dists )

Looking at the category/metadata.xml, it's a multilingual dictionary
entry and little else.  So make a dictionary of tags (categories).
And what does the latter half have to do with tagging things?  Where's
the maintenance?  There's the overhead of tagging it once, I'll grant.
 And then?  Tags are unlikely to change all that frequently once
they've been added (they don't need to).

 Perhaps we need a new way to apply metadata to a whole host of packages?

 Trying to make useflags apply to all packages
 with a given tagset would be comparatively messy.

Why do you think that?  The directory-like notation doesn't even need
to be discarded:
perl_module/* ssl
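
A sketch of how such a line might be interpreted, written in Python. The `tag/*` syntax is only the suggestion made here and is not supported by Portage; `parse_rule` and `flags_for` are hypothetical helpers.

```python
def parse_rule(line):
    """Split a 'tag/* flag...' line into (tag, [flags])."""
    atom, *flags = line.split()
    tag, _, rest = atom.partition("/")
    if rest != "*":
        raise ValueError("only tag/* atoms are handled in this sketch")
    return tag, flags

def flags_for(package_tags, rules):
    """Collect the flags of every rule whose tag the package carries."""
    flags = []
    for tag, rule_flags in rules:
        if tag in package_tags:
            flags.extend(rule_flags)
    return flags

RULES = [parse_rule("perl_module/* ssl")]
print(flags_for({"perl_module", "network"}, RULES))
```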

 categories also make it easy to do Naïve iteration of packages
 efficiently, ie: for the most part, if you want to iterate all
 perl-modules, you just need to iterate dev-perl and perl-core , and
 that is all, you're not bogged down by stepping into all the other
 categories, loading all their files and working out whether or not
 they're perl related. ( Yes, I am aware this has its own caveats, but
 if you know of these caveats and they're acceptable to your task, then
 it's fine )

Or just iterate over the perl_module tag.

 the 'virtuals' category also is a bundle of fun. I really do not want
 to see virtuals identified only by whatever their unique-identifier
 might be and the tag 'virtual'. Yuck.

In the first place, it's still no different: mysql (the virtual) pulls
in db-mysql (or charles or whatever name sounds good) whatever else
is available.  Or, as I mentioned before, while unique identifiers are
really terribly simple, we are fully capable of working around the
lack of that feature.  What prevents virtual/mysql from pulling in
database/mysql?

Regards,
Wyatt



[gentoo-dev] Re: RFC: split up media-sound/ category

2011-06-25 Thread Duncan
Maciej Mrozowski posted on Sat, 25 Jun 2011 13:55:40 +0200 as excerpted:

 On Friday 24 of June 2011 09:55:19 Ciaran McCreesh wrote:

 So tags are in some way related to categories then?
 
 IMHO the best approach is to forget about categories and:
 
 - make package names unique identifiers (it's not that hard, renaming
 stuff in app-xemacs mostly) - categories would serve no purpose as id
 anymore (though may need to be provided as backward compatibility - but
 with symlinks to ebuilds/${PN} inside)

What a beautiful bikeshed we're debating! =:^p

 - move such packages into ${PORTDIR}/ebuilds directory (so that identity
 is ensured on filesystem level) - 'ebuilds' name doesn't seem to be
 reserved anywhere so good candidate imho.
 To those concerned with directory lookup speed (in order to find package
 by name) - generated package index file provided in ${PORTDIR}

Alternatively, just use first letter subdivisions.  Perhaps grouping them 
as ac, df, etc, or whatever granularity seems appropriate, if desired.  
That's a common method of eliminating large-dir issues with otherwise 
flat listings.
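
For illustration, a first-letter bucketing scheme could look like the Python sketch below. The group width of three letters and the `bucket` function are assumptions; the "ac", "df" names follow the grouping mentioned above.

```python
import string

def bucket(name, width=3):
    """Return the subdirectory name (e.g. 'ac', 'df') for a package name."""
    letters = string.ascii_lowercase
    first = name[0].lower()
    if first not in letters:
        # Names starting with a digit or symbol share one bucket.
        return "other"
    start = (letters.index(first) // width) * width
    group = letters[start:start + width]
    return group[0] + group[-1]

for name in ("alsa-utils", "dbus", "git", "zsh"):
    print(name, "->", bucket(name))
```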

 - extend their metadata.xml (no ebuild variables please) with tags in
 accepted format. We should provide dictionary for available tags -
 necessary in order to avoid randomly added system tags - tag could be
 extended when needed - similar policy to global USE flags for instance

Keep in mind that there has historically been extremely high resistance 
to xml-ifying anything critical to operational package management, by 
certain highly respected and politically influential gentoo devs.  
There's a reason metadata.xml contains only ancillary data, while the 
most important operational data (depends, inherits, src_uri, etc) remains 
as variables within the ebuilds and/or eclasses.

I never tracked who was so stridently opposed and it may well be that 
they've retired now, but there's some people who simply don't consider xml 
a sufficiently robust solution in terms of parsing dependencies AND easy 
error-free human parsability.

FWIW, I agree that it'd be a step backward in terms of human editability 
ease and thus I'd find it a sad day were that to happen, but my feelings 
aren't particularly strong on the issue.

But if packages are indeed uniquely and canonically identified by name 
only and tags are kept as ancillary to the core merge process as 
metadata.xml is now, there shouldn't be a problem with it.

Just a warning that here be dragons for anyone thinking about going 
down that path.  Consider reading up in the list archive for past debates 
on the subject.

 - package manager needs to be make aware of tags of course in order to
 allow package list (not tree anymore) searching and filtering (virtual
 package tree can be generated from tag - by number of tag occurrences in
 packages - for instance all packages with tag kde could be shown
 somewhere within kde tag subtree etc)

As long as it's kept out of critical operational functionality... but 
this seems to be getting pretty close, if the package manager needs to 
be [made] aware of tags.  This would have been unlikely to have gotten 
thru at one point... but as I said, it's possible that opposition isn't 
any longer a factor.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman




Re: [gentoo-dev] Re: RFC: split up media-sound/ category

2011-06-25 Thread Maciej Mrozowski
On Saturday 25 of June 2011 17:19:47 Duncan wrote:
 Maciej Mrozowski posted on Sat, 25 Jun 2011 13:55:40 +0200 as excerpted:
  On Friday 24 of June 2011 09:55:19 Ciaran McCreesh wrote:
  So tags are in some way related to categories then?
  
  IMHO the best approach is to forget about categories and:
  
  - make package names unique identifiers (it's not that hard, renaming
  stuff in app-xemacs mostly) - categories would serve no purpose as id
  anymore (though may need to be provided as backward compatibility - but
  with symlinks to ebuilds/${PN} inside)
 
 What a beautiful bikeshed we're debating! =:^p

No bikeshedding, just any means necessary (I'd be fine with anything) in order 
to phase out categories from being necessary for critical package manager 
operations.

  - move such packages into ${PORTDIR}/ebuilds directory (so that identity
  is ensured on filesystem level) - 'ebuilds' name doesn't seem to be
  reserved anywhere so good candidate imho.
  To those concerned with directory lookup speed (in order to find package
  by name) - generated package index file provided in ${PORTDIR}
 
 Alternatively, just use first letter subdivisions.  Perhaps grouping them
 as ac, df, etc, or whatever granularity seems appropriate, if desired.
 That's a common method of eliminating large-dir issues with otherwise
 flat listings.

Using directory structure as a way to enhance performance is a sign of bad 
design. A simple
find /usr/portage/ebuilds -maxdepth 1 -type d | sort > ebuilds.index
should be sufficient. One can even extract tags into that file and list them 
after the package name for faster searching.

  - extend their metadata.xml (no ebuild variables please) with tags in
  accepted format. We should provide dictionary for available tags -
  necessary in order to avoid randomly added system tags - tag could be
  extended when needed - similar policy to global USE flags for instance
 
 Keep in mind that there has historically been extremely high resistance
 to xml-ifying anything critical to operational package management, by
 certain highly respected and politically influential gentoo devs.
 There's a reason metadata.xml contains only ancillary data, while the
 most important operational data (depends, inherits, src_uri, etc) remains
 as variables within the ebuilds and/or eclasses.

Yes, and the reason is metadata.xml can contain only version invariant data 
and should not contain anything that's required for ebuild.sh. So inherits, 
src_uris, dependencies - cannot be placed there. Assuming package names are 
unique identifiers, tags are not necessary to be available for ebuild.sh so 
metadata.xml is the best place.

 I never tracked who was so stridently opposed and it may well be that
 they've retired now, but there's some people who simply don't consider xml
 a sufficiently robust solution in terms of parsing dependencies AND easy
 error-free human parsability.
[...]

Let's not divert this purely technical topic into "who thought what about 
what, based on something" or "there are some people who don't consider 
xml...", and let them speak on technical matters if they want.

 FWIW, I agree that it'd be a step backward in terms of human editability
 ease and thus I'd find it a sad day were that to happen, but my feelings
 aren't particularly strong on the issue.
 
 But if packages are indeed uniquely and canonically identified by name
 only and tags are kept as ancillary to the core merge process as
 metadata.xml is now, there shouldn't be a problem with it.

No, tags are not supposed to be critical for package manager operations.
Package manager needs to be aware of them in order to provide useful 
searching, but that's about it.

I think we'd just need some simplest proof of concept implementation...

-- 
regards
MM


signature.asc
Description: This is a digitally signed message part.


Re: [gentoo-dev] SHA256 and indention in metadata.xml

2011-06-25 Thread Mike Frysinger
On Sat, Jun 25, 2011 at 10:23, Nirbheek Chauhan wrote:
 On Sat, Jun 25, 2011 at 6:16 PM, justin wrote:
 Another question, do we have a rule, how the metadata.xml has to be
 indented? Tabs or n spaces?

 There's no rule, but we should follow the same rule as ebuilds —
 indentation should be with a tab that's displayed as 4 spaces in
 editors (no expansion of tabs to spaces).

meh ... let devs do whatever they want
-mike



Re: [gentoo-dev] Re: RFC: split up media-sound/ category

2011-06-25 Thread Ulrich Mueller
 On Sat, 25 Jun 2011, Maciej Mrozowski wrote:

 Assuming package names are unique identifiers, tags are not
 necessary to be available for ebuild.sh so metadata.xml is the best
 place.

But we know that package names are _not_ unique. There are many cases
in the Portage tree where two or even more packages have the same
name. Categories are there to avoid such collisions.

With multiple overlays/repositories instead of one monolithic Portage
tree, the collision issue gets even worse if you have a flat
namespace.

Ulrich



Re: [gentoo-dev] SHA256 and indention in metadata.xml

2011-06-25 Thread Nirbheek Chauhan
On Sat, Jun 25, 2011 at 10:54 PM, Mike Frysinger vap...@gentoo.org wrote:
 On Sat, Jun 25, 2011 at 10:23, Nirbheek Chauhan wrote:
 On Sat, Jun 25, 2011 at 6:16 PM, justin wrote:
 Another question, do we have a rule, how the metadata.xml has to be
 indented? Tabs or n spaces?

 There's no rule, but we should follow the same rule as ebuilds —
 indentation should be with a tab that's displayed as 4 spaces in
 editors (no expansion of tabs to spaces).

 meh ... let devs do whatever they want


Didn't I just say there's no rule?

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team



Re: [gentoo-dev] Re: RFC: split up media-sound/ category

2011-06-25 Thread Maciej Mrozowski
On Saturday 25 of June 2011 19:29:58 Ulrich Mueller wrote:
  On Sat, 25 Jun 2011, Maciej Mrozowski wrote:
  Assuming package names are unique identifiers, tags are not
  necessary to be available for ebuild.sh so metadata.xml is the best
  place.
 
 But we know that package names are _not_ unique. There are many cases
 in the Portage tree where two or even more packages have the same
 name. Categories are there to avoid such collisions.

But we also know that making package names unique is the first step to take, as I
already noted in my first post in this thread. The current package naming scheme
should not be an unfixable obstacle preventing us from getting rid of pointless
categories (yes, every pkgmove in the tree renders the categories concept
broken by design, sorry to state this fact brutally).

As far as app-xemacs is concerned (and probably why you commented here), it
should be sufficient to prepend xemacs- to package names from the app-xemacs
category in order to distinguish them from the rest.
It would be elegant and correct - after all, when you emerge ocaml you don't
expect to be installing the Objective Caml mode for Emacs, but the ocaml
interpreter itself.

 With multiple overlays/repositories instead of one monolithic Portage
 tree, the collision issue gets even worse if you have a flat
 namespace.

Every non-Gentoo-based distro can live with unique package names; somehow
Gentoo is not able to? Colour me surprised.

Btw, above I specifically proposed that those unique packages be placed in
${PORTDIR}/ebuilds/, because when 'ebuilds' is treated as a fake category,
the existing atom syntax can be used, and so can the current package manager
implementation (even with a package tree that is not entirely converted, except
that uniqueness is not checked in that case).

-- 
regards
MM




Re: [gentoo-dev] SHA256 and indention in metadata.xml

2011-06-25 Thread Jorge Manuel B. S. Vicetto
On 25-06-2011 14:23, Nirbheek Chauhan wrote:
 On Sat, Jun 25, 2011 at 6:16 PM, justin j...@gentoo.org wrote:
 Another question, do we have a rule, how the metadata.xml has to be
 indented? Tabs or n spaces?

 
 There's no rule, but we should follow the same rule as ebuilds —
 indentation should be with a tab that's displayed as 4 spaces in
 editors (no expansion of tabs to spaces).

Speaking from my own experience when doing retirement stuff, there seem
to be two main conventions for metadata.xml in the tree: tabs and 2
spaces for indentation.
I personally prefer tabs, but I also like using EAPI=version,
sorting everything alphabetically and even using the following depend blocks:

*DEPEND=
!X-2.0
!Y
A
B
...
Z
a? ( X )
b? ( Y )
c? (
J
K
)


As expected, I'm sure many of the others disagree / dislike at least
part of my preferences.

-- 
Regards,

Jorge Vicetto (jmbsvicetto) - jmbsvicetto at gentoo dot org
Gentoo- forums / Userrel / Devrel / KDE / Elections / RelEng



Re: [gentoo-dev] Re: RFC: split up media-sound/ category

2011-06-25 Thread Kent Fredric
On 26 June 2011 05:29, Ulrich Mueller u...@gentoo.org wrote:
 On Sat, 25 Jun 2011, Maciej Mrozowski wrote:

 Assuming package names are unique identifiers, tags are not
 necessary to be available for ebuild.sh so metadata.xml is the best
 place.

 But we know that package names are _not_ unique. There are many cases
 in the Portage tree where two or even more packages have the same
 name. Categories are there to avoid such collisions.

 With multiple overlays/repositories instead of one monolithic Portage
 tree, the collision issue gets even worse if you have a flat
 namespace.

 Ulrich



At present, this exists because we use categories as the primary way
for a *user* to find the package. Our current collision avoidance
strategy is targeted at not confusing our *user*.

However, in the proposed strategy, package names themselves are not
the *user's* primary interface. Tags and other metadata are intended as
the user's primary interface.

Package names themselves can thus be arbitrary, and could be a SHA
sum or something obscure; as long as all internals and dependencies
used the same arbitrary name, things would work as intended.

There is one remaining downside to the flat topology however, and that
is it may hamper our move to git.

I was thinking that what could be done is to have separate submodules or
what have you for the various categories, to somewhat ease the problem that
a full checkout of the tree going back for all time can be bloody huge
and slow, but without categorical subtrees that approach will be
less viable.

Although what currently seems like a disadvantage could perhaps be turned
into an advantage, with the possible idea of meta trees of
some description. If we relinquish the hold we have on symlinks, a few
interesting options become available.

Different Herds could have their own subtrees in

/projects/herd/x

and a tool could be used to symlink the contents of herd specific
subtrees to the ebuilds folder.

/ebuild/pn  -->  /projects/herd/herdname/pn

And the tool can inform herds when they add a new package that
conflicts with the name of an existing one so it can be disambiguated
before the tree propagates to users.
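
A rough sketch of such a tool, in Python. Everything here is an assumption
made for illustration: the function names plan_links and apply_links are
hypothetical, and the layout is taken to be <herd root>/<herdname>/<pn> with
one directory per package. It only shows the two steps described above:
collect package names across herds, report clashes, and symlink the rest.

```python
from collections import defaultdict
from pathlib import Path


def plan_links(herd_root):
    """Return (links, conflicts) for a layout of herd_root/<herdname>/<pn>.

    links maps each unambiguous package name to its herd directory;
    conflicts maps each clashing name to the herds that claim it.
    """
    owners = defaultdict(list)
    for herd_dir in sorted(p for p in Path(herd_root).iterdir() if p.is_dir()):
        for pkg_dir in sorted(p for p in herd_dir.iterdir() if p.is_dir()):
            owners[pkg_dir.name].append(pkg_dir)
    links = {pn: dirs[0] for pn, dirs in owners.items() if len(dirs) == 1}
    conflicts = {pn: [d.parent.name for d in dirs]
                 for pn, dirs in owners.items() if len(dirs) > 1}
    return links, conflicts


def apply_links(links, ebuild_root):
    """Create ebuild_root/<pn> symlinks to herd directories, replacing stale links."""
    ebuild_root = Path(ebuild_root)
    ebuild_root.mkdir(parents=True, exist_ok=True)
    for pn, target in links.items():
        link = ebuild_root / pn
        if link.is_symlink():
            link.unlink()
        link.symlink_to(target, target_is_directory=True)
```

Conflicting names are deliberately left out of the link plan, so nothing
ambiguous reaches the flat directory until the herds have renamed one side.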

Continue on that line of thought and you get even more interesting
ideas, like introducing a merge mask file, which allows people to
work on stuff in the herd tree and indicate that their
files/packages are not fit for integration with the main tree yet,
somewhat bridging the gap we presently have between development
overlays and the current main tree.

This could in turn make collaboration even easier, as dev branches
will be able to go nuts with all sorts of random contributions, and
when a package is deemed fit for public consumption and testing, remove it
from the merge mask and it's there.
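
To make the merge mask idea a bit more concrete, here is a minimal Python
sketch. The file format is purely hypothetical (one package name per line,
with '#' comments), as are the function names parse_merge_mask and
publishable; nothing like this exists in the tree today.

```python
def parse_merge_mask(text):
    """Parse a hypothetical merge mask: one package name per line, '#' comments."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            names.add(line)
    return names


def publishable(herd_packages, mask):
    """Return the herd's packages that are not held back by the merge mask."""
    return sorted(pn for pn in herd_packages if pn not in mask)
```

Publishing a package then amounts to deleting its line from the mask file;
the next run of the integration step would pick it up.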

/early morning coffee fuelled idea session

-- 
Kent

perl -e "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3,
3 ) for ( 9,8,0,7,1,6,5,4,3,2 );"

http://kent-fredric.fox.geek.nz



[gentoo-dev] Thoughts about broken package handling

2011-06-25 Thread Stuart Longland
Hi all,

I've been busy for the past month or two updating some of my
systems.  In particular, the Yeeloong I have hasn't seen attention in a
very long time.  As soon as I update one part, however, I find some swath of
packages breaks because of a soname change, anything Python-related stops
working because of a move from Python 2.6 to 2.7, or Perl gets updated.

Currently we have three packages that handle this separately:
- revdep-rebuild (handles packages broken by soname changes, etc)
- python-updater (handles Python module rebuilds after upgrading Python)
- perl-cleaner (handles Perl module rebuilds after upgrading Perl)

My bugbear at the moment is that a package is often broken for more than one
reason in my situation, and I find myself having to manhandle the
package lists generated by the above three, building each package
one-by-one, until I manage to rebuild them all.

Or sometimes a package being rebuilt by revdep-rebuild fails because of
a Python module, I'll manually merge that module, then play another
round of Russian Roulette to see which package gets shot down next.

Issues are complicated further when revdep-rebuild, or whatever tool,
passes the list to Portage and Portage fails to calculate dependencies... I
just had one case where revdep-rebuild failed because there were no
ebuilds to satisfy:

sys-devel/gcc:i686-pc-linux-gnu-4.4.5

I've worked around this by picking up the list generated by
revdep-rebuild (in /var/cache/... ), and using a bash while read loop to
pass each package individually to emerge for building.
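
The same workaround, sketched in Python rather than bash. This assumes the
saved list holds one package atom per line, which may not match the actual
file format; the function name rebuild_one_by_one and its run parameter are
made up for the example. emerge --oneshot is used so the rebuilt packages are
not added to the world file.

```python
import subprocess


def rebuild_one_by_one(list_path, run=subprocess.call):
    """Call emerge once per atom in list_path; return the atoms that failed.

    Assumes one atom per line; blank lines and '#' comments are skipped.
    'run' is injectable so the loop can be exercised without invoking emerge.
    """
    failed = []
    with open(list_path) as fh:
        for line in fh:
            atom = line.strip()
            if not atom or atom.startswith("#"):
                continue
            if run(["emerge", "--oneshot", atom]) != 0:
                failed.append(atom)
    return failed
```

A failure in one package no longer stops the rest; the returned list shows
what still needs manual attention.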

How well is this cleanup trio working?  It works, but I think it could
improve.

The thing I see is that all three are fixing essentially the same
problem: package breakage due to a change in the dependencies.  I think
there is scope for a single package, or better yet, Portage extension,
that handles all three cases.

Concept:

Tool will be written in separate modules to handle:
- ELF soname change breakage
- Python module updates
- Perl module updates
- other checks that can cause broken packages...

Each check is run in order, generating a list of packages that should be
rebuilt.

Having generated this list, it is then evaluated to sort the candidate
packages into a suitable order for rebuilding.

This is then passed to the package manager... three modes for rebuilds:
- All-in-one-hit rebuild: What the tools do at present.
- One-by-one rebuild: For each package in the list, build each one
individually... useful if Portage coughs up an error otherwise
- Dump the list: allows people to handle it with their own tools
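
To show roughly how the pieces above could fit together, here is a small
Python sketch. It is not a design: the check functions are placeholders
supplied by the caller, the names collect_broken, rebuild_order and
run_rebuild are hypothetical, and the dependency map is assumed to come
from somewhere else (for example the package manager). Ordering uses
graphlib from the standard library, so Python 3.9 or later is assumed.

```python
import subprocess
from graphlib import TopologicalSorter


def collect_broken(checks):
    """Run each check in order; map every broken package to its reasons."""
    broken = {}
    for name, check in checks.items():
        for pkg in check():
            broken.setdefault(pkg, set()).add(name)
    return broken


def rebuild_order(broken, deps):
    """Order broken packages so that broken dependencies are rebuilt first.

    deps maps a package to the packages it depends on; dependencies that are
    not themselves broken are ignored. A dependency cycle raises
    graphlib.CycleError.
    """
    graph = {pkg: [d for d in deps.get(pkg, ()) if d in broken]
             for pkg in broken}
    return list(TopologicalSorter(graph).static_order())


def run_rebuild(ordered, mode, run=subprocess.call):
    """Rebuild in one of three modes and return the packages that failed."""
    if mode == "all":    # all-in-one-hit: a single emerge call
        return list(ordered) if run(["emerge", "--oneshot", *ordered]) else []
    if mode == "each":   # one-by-one: one emerge call per package
        return [p for p in ordered if run(["emerge", "--oneshot", p]) != 0]
    if mode == "dump":   # just print the list for other tools
        print("\n".join(ordered))
        return []
    raise ValueError("unknown mode: %s" % mode)
```

A package flagged by several checks appears once in the list, with all of its
reasons recorded, which is the case the three separate tools handle poorly.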

I might see if I can rough something up, but that's what I'm thinking
of.  It has been an irritation for me for quite some time.

Thoughts,
-- 
Stuart Longland (aka Redhatter, VK4MSL)  .'''.
Gentoo Linux/MIPS Cobalt and Docs Developer  '.'` :
. . . . . . . . . . . . . . . . . . . . . .   .'.'
http://dev.gentoo.org/~redhatter :.'

I haven't lost my mind...
  ...it's backed up on a tape somewhere.



Re: [gentoo-dev] Re: RFC: split up media-sound/ category

2011-06-25 Thread Wyatt Epp
On Sat, Jun 25, 2011 at 21:47, Kent Fredric kentfred...@gmail.com wrote:
 Package names themselves can thus be arbitrary, and could be a SHA
 sum or something obscure; as long as all internals and dependencies
 used the same arbitrary name, things would work as intended.

I mentioned this idea of internally referencing packages by a hash in
the other thread.  As long as we're clear that the most common
operation (emerge -av ${PN}) is still exposed to the user, it's
perfectly valid.  I want to be very sure we're clear in our
understanding that tags are for discovery in cases where the user is
not sure what is available (like categories).

As for the latter part, the size of a git repo becoming unmanageable
over time had not occurred to me, I'm afraid-- would it work to use
shallow clones?  Otherwise, the herd-wise division is probably
acceptable.  Need to think about that one more.

Regards,
Wyatt



[gentoo-dev] Re: Thoughts about broken package handling

2011-06-25 Thread Duncan
Stuart Longland posted on Sun, 26 Jun 2011 12:59:05 +1000 as excerpted:

 Currently we have three packages that handle this separately:
 - revdep-rebuild (handles packages broken by soname changes, etc)
 - python-updater (handles Python module rebuilds after upgrading Python)
 - perl-cleaner (handles Perl module rebuilds after upgrading Perl)
 
 My bugbear at the moment, is often a package is broken for more than one
 reason in my situation, and I find myself having to manhandle the
 package lists generated by the above three, building each package
 one-by-one, until I manage to rebuild them all.

I've gone thru that once on my netbook, and will likely be doing it again 
every 4-8 months, as I don't keep it as updated as my workstation (which 
I try to update weekly to daily).

At 6-8 months it's doable, but requires patience...  Much beyond that and 
doing a new stage install might well be easier.

 Issues are complicated further when [portage] fails to calculate
 dependencies [due to python breakage, etc]...

 The thing I see is that all three are fixing essentially the same
 problem: package breakage due to a change in the dependencies.  I think
 there is scope for a single package, or better yet, Portage extension,
 that handles all three cases.
 
 Concept:
 
 Tool will be written in separate modules to handle:
 - ELF soname change breakage
 - Python module updates
 - Perl module updates
 - other checks that can cause broken packages...

 [The combined list] is then passed to the package manager...

 three modes for rebuilds:
 - All-in-one-hit [current]
 - One-by-one [if portage chokes on the big list]
 - Dump the list: allows people to handle it with their own tools
 
 I might see if I can rough something up, but that's what I'm thinking
 of.  It has been an irritation for me for quite some time.
 
 Thoughts,

I'm sure most users will find this VERY useful.  I know I will, in no 
small part because while I've integrated revdep-rebuild into my regular 
update routine, the perl and python rebuilders don't get run as 
regularly.  If there was a single tool that could scan all three sets, 
plus be modular enough to expand to others as necessary...

That's even for routine updates.  For the longer term updates, the second 
and third modes would be a HUGE help, particularly as they could allow 
etc-updates or other config and other not-automatically-package-install-
triggered updates at critical points as well, something the first mode 
doesn't really handle.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman




Re: [gentoo-dev] Re: RFC: split up media-sound/ category

2011-06-25 Thread Kent Fredric
On 26 June 2011 15:49, Wyatt Epp wyatt@gmail.com wrote:
 As for the latter part, the size of a git repo becoming unmanageable
 over time had not occurred to me, I'm afraid-- would it work to use
 shallow clones?  Otherwise, the herd-wise division is probably
 acceptable.  Need to think about that one more.


  --depth <depth>
   Create a shallow clone with a history truncated to the specified
   number of revisions. A shallow repository has a number of
   limitations (you cannot clone or fetch from it, nor push from nor
   into it), but is adequate if you are only interested in the recent
   history of a large project with a long history, and would want to
   send in fixes as patches.

It would perhaps be OK for non-contributing users to use shallow
clones, but in my understanding, shallow clones limit you to doing
what you could do with a tar file of the specified revision, which
basically makes them impractical for people who are developing on the tree.
Without them, every new developer would face a progressively longer
wait to do a complete checkout.

(Unless of course we had some sort of periodic refresh where history
was discarded/rebased into nonexistence, but that is really the same
problem with a different face.)

-- 
Kent

perl -e "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3,
3 ) for ( 9,8,0,7,1,6,5,4,3,2 );"

http://kent-fredric.fox.geek.nz



Re: [gentoo-dev] Thoughts about broken package handling

2011-06-25 Thread Benedikt Böhm
On Sun, Jun 26, 2011 at 4:59 AM, Stuart Longland redhat...@gentoo.org wrote:
 - revdep-rebuild (handles packages broken by soname changes, etc)

solved by preserved-libs in portage-2.2

 - python-updater (handles Python module rebuilds after upgrading Python)
 - perl-cleaner (handles Perl module rebuilds after upgrading Perl)

these just exist because python and perl ebuilds are horribly broken.
take a look at RUBY_TARGETS or PHP_TARGETS for an example of how to do
it right. this would also fix all the failures that python and perl
introduce to binary packages.

-Bene



Re: [gentoo-dev] Thoughts about broken package handling

2011-06-25 Thread Kent Fredric
On 26 June 2011 17:44, Benedikt Böhm hol...@gentoo.org wrote:
 these just exist because python and perl ebuilds are horribly broken.
 take a look at RUBY_TARGETS or PHP_TARGETS for an example of how to do
 it right. this would also fix all the failures that python and perl
 introduce to binary packages.

Perl isn't slotted, and it is presently far too complex to slot
reliably. Ideally we could have Perl slotted, but the effort involved
is huge at present. If you're willing to contribute patches to solve
this problem you're welcome to, but until then, perl-cleaner is really
the best we can do.


-- 
Kent

perl -e "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3,
3 ) for ( 9,8,0,7,1,6,5,4,3,2 );"

http://kent-fredric.fox.geek.nz