[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-01 Thread Diego 'Flameeyes' Pettenò
Alec Warner [EMAIL PROTECTED] writes:

 - Space savings.  Certainly your scheme may be smaller, but the XML
 tag overhead may eat into the savings.  You should do some estimates
 to show the community how much smaller the tree will be from this
 proposal.

Sorry but you lost me on any point you might have brought across since
after this I feel like you were trying to put words in my mouth.

Besides, if you really want to go down that road you should take into
account that, apart from ReiserFS with tail packing, I don't remember any
other Linux FS that has blocks smaller than 512 bytes, which means that
each file in the metadata cache takes up much more than just its size in
characters.

All your math is thus wrong.
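
To make the block-size argument concrete, here is a minimal sketch of how allocation in whole blocks inflates tiny files. The 4096-byte default and the 60-byte entry are assumptions picked for illustration, not measurements of the tree.

```python
def on_disk_size(content_bytes, block_size=4096):
    """Space a file occupies when the filesystem allocates whole blocks."""
    if content_bytes <= 0:
        return 0
    # Round up to the next multiple of block_size using integer arithmetic.
    blocks = -(-content_bytes // block_size)
    return blocks * block_size

# A 60-byte entry stored as its own file still costs one full block.
for size, block in ((60, 4096), (60, 512), (5000, 4096)):
    print(size, block, on_disk_size(size, block))
```

Counting characters therefore underestimates the real cost of many small files; the rounding dominates.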

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-01 Thread Diego 'Flameeyes' Pettenò
Jan Kundrát [EMAIL PROTECTED] writes:

 But also the need to replicate http://www.kde.org/ to metadata.xml of
 all KDE split ebuilds -- right now, this is set by an eclass.

The usefulness of this is IMHO debatable; why not just write it in one
package (say kde-base/kde or kde-meta) and only there? Having each
mini-package declare that as its homepage is not very useful to me, but
I guess it's debatable.

 - allows proper handling of packages lacking a HOMEPAGE;

 Could you elaborate a bit about how different is handling of an
 empty/uninitialized shell variable from an empty XML element?

That you can provide _other_ links besides a homepage, like
unmaintained, gentoo:userguide and so on, so that users don't
just get no homepage at all, and are not misdirected by the
homepage being http://www.gentoo.org/ or something.

 - users can check the metadata much more easily by just opening the xml
   file or interfacing to that rather than having to skim through the
   ebuild, the xml files are probably more user readable than ebuilds
   using multiple eclasses;

 Haven't we already agreed that accessing ebuilds/... directly is
 broken by design?

For software, sure, but as a user I naturally tend to just look at the
files if I'm looking for the homepage of a package I know, and seeing a
metadata.xml file I'm more likely to look at that rather than at the
metadata cache in /var/db/... .

And an XML file is certainly more user-readable than HOMEPAGE with
depend-like syntax for labels and conditionals and whatever else Alec
seems to be proposing for EAPI=3.

 - webapps like packages.gentoo.org would be able to display basic
   information without having to parse the ebuilds or the metadata cache.

 Except for the ebuilds which still use the old format (that is 100% of
 the tree right now)

This is of course meant for when this is fully implemented.

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




[gentoo-dev] Re: debug/release builds extensions/clarification proposal

2008-12-01 Thread Diego 'Flameeyes' Pettenò
Maciej Mrozowski [EMAIL PROTECTED] writes:

 - USE=debug is useless  when CFLAGS/LDFLAGS or FEATURES are not appropriate

What are you saying here? I'm afraid you're mistaken here.

For the most part, USE=debug means "enable debug code paths", which for
lots of projects simply means "enable assertions"; there are packages
that take this as "enable debug symbols" too, but I don't think that's
very valid since users might want debug code paths but not symbols and
vice-versa (I indeed have debug symbols but no debug code paths enabled).

Now just to make sure the common misconceptions don't hit again:

- -ggdb *does not have any runtime performance hit*; neither in
   execution time nor in memory usage; the debug sections are not mapped
   into memory at all; this is true for both non-stripped and split
   executables;
- -O0 is not always a good idea; besides bugs in packages concealed by
   -O1+ [1], there are some further points: the shortage of registers on
   x86 causes build failures, and if ( 0 ) cases are not optimised away,
   so packages like FFmpeg fail to link because undefined references
   are not pruned; this means that using -O0 unconditionally for debug
   builds of any package is not really an option;

[1] http://blog.flameeyes.eu/2008/09/02/testing-the-corner-cases

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




Re: [gentoo-dev] debug/release builds extensions/clarification proposal

2008-12-01 Thread Peter Volkov
On Mon, 01/12/2008 at 06:16 +0100, Maciej Mrozowski wrote:
 Currently handling debug/release builds is incoherent and misleading to say 
 the least. We have got in Gentoo:

All those parts do separate and quite different jobs, so I can't
say that it's incoherent (in concept, at least).

 The drawbacks are as follows:
 - USE=debug is useless  when CFLAGS/LDFLAGS or FEATURES are not appropriate

USE=debug enables additional debug output or more assertions in the
code. It's hard to tell in advance exactly what USE=debug does, since
different packages enable different things. But generally it adds
extra code with -DDEBUG, and this is independent of CFLAGS/LDFLAGS.
If you know of packages where this is not true, file bugs against them.

 - CFLAGS/LDFLAGS must be set globally when they are about to be supported
 - those who don't want to set them globally, they are forced to use (very 
 flexible and great indeed) /etc/portage/env hack - which is undocumented and 
 unsupported, because everything user set there, is not shown by emerge 
 --info, 
 thus bug reports from such machines  are not taken into consideration, as 
 virtually everything that breaks can be there

This leads me to a different conclusion. I was thinking about a new
portage feature: emerge --info pkg, so that portage shows not only
global information but per-package information as well. In many cases
this would simplify analysis of the problem.

 - too much choice leads to confusion

That's always true. But we use Gentoo because we enjoy our freedom to
choose... Right? :)

 Implementation is trivial - eclass would be responsible for handling 
 USE=debug 
 flag, when debug is set:
 - replace CFLAGS with CFLAGS_DEBUG, LDFLAGS with LDFLAGS_DEBUG and possibly 
 others
 - replace FEATURES with FEATURES_DEBUG

USE flags should never change {C,LD}FLAGS or FEATURES, as they are
different things, and such a relation between USE flags, {C,LD}FLAGS
and/or FEATURES would lead to even more confusion. (There is also the
complexity Duncan told you about...)

Personally to get build with symbols I use a trivial wrapper around
emerge:

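demerge() {
  # Build with debug code paths and symbols, keeping the resulting
  # binary packages in a separate PKGDIR.
  env USE="debug" CFLAGS="-O2 -pipe -g -ggdb" PKGDIR="/vt/binpkg-debug" \
    FEATURES="buildpkg splitdebug collision-protect ccache noclean installsources" \
    emerge "$@"
}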

and I use demerge whenever I need to debug a package. I'm sure this is
just a quick hack which could be greatly improved to track which
packages are installed with or without symbols. But you get the idea:
such things are better done with separate, very small and simple
wrappers.

P.S. I remember most of this was already discussed in this mailing list.
Try to search it and you'll find much more ideas and motivations.

-- 
Peter.




Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-01 Thread Alec Warner
On Mon, Dec 1, 2008 at 12:24 AM, Diego 'Flameeyes' Pettenò
[EMAIL PROTECTED] wrote:
 Alec Warner [EMAIL PROTECTED] writes:

 - Space savings.  Certainly your scheme may be smaller, but the XML
 tag overhead may eat into the savings.  You should do some estimates
 to show the community how much smaller the tree will be from this
 proposal.

 Sorry but you lost me on any point you might have brought across since
 after this I feel like you were trying to put words in my mouth.

Sorry for that, I never meant to imply that you said space savings.

That being said I still don't see the usefulness here.

You seem to think that using the existing APIs for this data is wrong,
and I think the opposite, so I guess we will agree to disagree on this
matter.


 Beside, if you really want to go down that road you should be counting
 that beside ReiserFS with tail, I don't remember any other Linux FS that
 has block smaller than 512bytes, which means that each file in metadata
 cache is taking up much more than just its size in characters.

 All your math is thus wrong.

As was pointed out on IRC, UTF8 characters are not a fixed size,
making my math even more wrong ;)





[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future

2008-12-01 Thread Diego 'Flameeyes' Pettenò
Alec Warner [EMAIL PROTECTED] writes:

 That being said I still don't see the usefulness here.

 You seem to think that using the existing APIs for this data is wrong,
 and I think the opposite, so I guess we will agree to disagree on this
 matter.

Yeah, I still think that there is no point in requiring the use of a
specific API when the same data can easily be made available in a format
that is more or less easy to parse in any programming language, modern
or not.

Besides, I find expanding the HOMEPAGE syntax to allow more than one
link a bit ... overkill, if the same thing can be achieved in
metadata.xml...
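
As a rough illustration of the "easy to parse" point, here is a sketch that reads several typed links out of a metadata.xml-like document with Python's standard library. The <links>/<link type="..."> layout is purely hypothetical (no schema has been agreed on in this thread), and the userguide URL is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: <links> and <link type="..."> are invented for this
# example and are not part of any existing metadata.xml DTD.
SAMPLE = """\
<pkgmetadata>
  <links>
    <link type="homepage">http://www.kde.org/</link>
    <link type="gentoo:userguide">http://example.org/userguide</link>
  </links>
</pkgmetadata>
"""

def links_by_type(xml_text):
    """Return a {type: url} mapping for every <link> element in the document."""
    root = ET.fromstring(xml_text)
    return {link.get("type"): (link.text or "").strip()
            for link in root.iter("link")}

links = links_by_type(SAMPLE)
print(links.get("homepage", "no homepage recorded"))
```

A package without a homepage would simply lack that entry while still offering the other links.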

 Beside, if you really want to go down that road you should be counting
 that beside ReiserFS with tail, I don't remember any other Linux FS that
 has block smaller than 512bytes, which means that each file in metadata
 cache is taking up much more than just its size in characters.

 All your math is thus wrong.

 As was pointed out on IRC, UTF8 characters are not a fixed size,
 making my math even more wrong ;)

If we consider HOMEPAGE, the assumption that characters have a fixed
size of 1 byte is good enough; URLs are usually encoded in pure ASCII
for compatibility; while IDN would break that assumption, we can't even
assume that IDN is always available, and so on.

For the description it may be different, because there is room there
for UTF-8 characters, but that takes us even further from the point.
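
A quick check of the one-byte-per-character assumption, as a sketch: for a pure-ASCII URL the character count and the UTF-8 byte count agree, while text with accented letters needs more bytes than characters. The two sample strings are just convenient examples taken from this thread.

```python
def utf8_size(text):
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

url = "http://blog.flameeyes.eu/"   # pure ASCII
name = "Diego Pettenò"              # contains one non-ASCII character

for sample in (url, name):
    print(repr(sample), len(sample), utf8_size(sample))
```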

-- 
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/




Re: [gentoo-dev] Re: debug/release builds extensions/clarification proposal

2008-12-01 Thread Maciej Mrozowski
On Monday 01 of December 2008 08:04:04 Duncan wrote:

Well, so far it's not a GLEP, just an idea thrown out for brainstorming.

 As such, neither /etc/portage/env nor eclasses can effectively deal with
 FEATURES in general, tho there are a few specific exceptions that do
 happen to be implemented at the bash level.

Those exceptions are nostrip and splitdebug at least; besides, I intend to
keep it at the bash (or ebuild) level only, to preserve simplicity and yet
functionality. FEATURES_DEBUG looked like a clean and convenient approach
because I was unaware of FEATURES internals - thanks for the clarification.
The small FEATURES inconsistency problem needs to be addressed. The goal is
to have only one well-defined and always working way of not stripping
symbols. Of course it can easily be handled in an eclass by something like
this:

if use debug; then
   # Drop both symbol-related features, then add back the preferred one.
   FEATURES="${FEATURES//splitdebug/}"
   FEATURES="${FEATURES//nostrip/}"
   FEATURES="${FEATURES} ${PREFERRED_NOSTRIP_METHOD}"
fi
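
The same idea can be sketched in Python with whole-token matching, which avoids accidentally rewriting part of a longer feature name the way plain substring replacement could. The function name and the sample values are made up for illustration; this is not portage code.

```python
def set_nostrip_method(features, preferred):
    """Remove splitdebug/nostrip as whole tokens, then append the preferred one."""
    tokens = [t for t in features.split() if t not in ("splitdebug", "nostrip")]
    tokens.append(preferred)
    return " ".join(tokens)

print(set_nostrip_method("ccache nostrip buildpkg splitdebug", "splitdebug"))
```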





[gentoo-dev] Re: [RFC] Saving package emerge output (einfo, elog, ewarn, etc.) somewhere official

2008-12-01 Thread Christian Faulhammer
Hi,

Dale [EMAIL PROTECTED]:
 If you have a GUI on your system, give this a look: 
 app-portage/elogviewer  That should help you a lot.  I been using it
 for a good while and it works pretty well.  I do wish it had little
 flags in the list of packages that have been installed.  Sort of a
 short and sweet  notice there is something there without actually
 have to look. Maybe a red flag when there is something really serious
 to know and other colors for other things.

 app-portage/elogv (ncurses) and app-portage/kelogviewer (Qt based) are
really nice, too.  Unfortunately the two GUI variants are homeless, so
improvements won't happen from the original upstream.

V-Li

-- 
Christian Faulhammer, Gentoo Lisp project
URL:http://www.gentoo.org/proj/en/lisp/, #gentoo-lisp on FreeNode

URL:http://www.faulhammer.org/




Re: [gentoo-dev] Re: debug/release builds extensions/clarification proposal

2008-12-01 Thread Maciej Mrozowski
On Monday 01 of December 2008 09:36:12 Diego 'Flameeyes' Pettenò wrote:
  - USE=debug is useless  when CFLAGS/LDFLAGS or FEATURES are not
  appropriate
 What are you saying here? I'm afraid you're mistaken here.

The point is to look at this (a bit, at least) from the users' point of
view - the USE=debug flag is ambiguous in its meaning. While it enables only
code paths (asserts, #ifdefs and similar), it suggests (by name, and for some
packages not only suggests) enabling debug symbols.
And the policy, as far as I know, is to enforce CFLAGS from make.conf and wipe
out any package-defined flags.

 For the most part, USE=debug means enable debug code paths, which for
 lots of projects simply means enable assertions; there are packages
 that take this as enable debug symbols too but I don't think that's
 very valid since users might want debug code paths but not symbols and
 vice-versa (I indeed have debug symbols bug no debug codepaths enabled).

That's correct; the problem is that Gentoo does not provide an officially
supported mechanism for enabling both, or just debug symbols, on a per-package
basis - it doesn't even provide any supported/documented mechanism for
per-package CFLAGS, FEATURES and similar.
If the /etc/portage/env hack/feature could be made official (for CFLAGS, LDFLAGS
and bash-domain FEATURES), it could address this issue well enough, because
with a smart combination of symlinks/files it would deliver the ultimate
configuration power, not just the cleanup/workaround I am actually
proposing. Per-package debug/release/profile/or_any_other configuration is
what I would pursue, and in my proposal I used USE=debug as an existing and
supported way of achieving this.

While I don't like the hack @pve uses (I prefer portage/env as a more convenient
way), his idea about emerge --info pkg seems interesting.

 - -ggdb *does not have any runtime performance hit*; neither in

Yes, I'm well aware of that, though it increases disk space requirements a bit 
as it's applied to all libs/bins.

 - -O0 is not always a good idea; beside bugs in packages concealed by
-O1+ [1]

[1] is a pathology and should be fought against. -O1+ may leave the stack
frames useless for debugging due to inline optimizations in some places
(debugging inline class implementations in particular is limited, which
affects Qt/KDE). Besides - I may not have stated it clearly - those default
values would be defined in the very same make.conf, so it could be:

CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=nocona -O2 -pipe -msse3 -ftree-vectorize"
CXXFLAGS="${CFLAGS}"

CFLAGS_DEBUG="-O2 -ggdb"
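
The selection rule of the proposal could be sketched like this: when debug is in USE, each *_DEBUG variable that is defined replaces its regular counterpart. The function name and the dictionary representation of make.conf are assumptions for illustration only; nothing like this exists in portage, and as noted elsewhere in the thread FEATURES cannot in general be switched this way.

```python
def effective_build_env(make_conf, use_flags):
    """Return the build variables after applying the proposed *_DEBUG overrides."""
    env = dict(make_conf)
    if "debug" in use_flags:
        for var in ("CFLAGS", "LDFLAGS", "FEATURES"):
            debug_value = make_conf.get(var + "_DEBUG")
            if debug_value is not None:
                env[var] = debug_value
    return env

conf = {
    "CFLAGS": "-march=nocona -O2 -pipe -msse3 -ftree-vectorize",
    "CFLAGS_DEBUG": "-O2 -ggdb",
}
print(effective_build_env(conf, {"debug"})["CFLAGS"])
print(effective_build_env(conf, set())["CFLAGS"])
```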

Yet I still cannot think of this proposal as anything other than a dirty
workaround for a problem that doesn't really exist (well, at least for
developers, who have a meta-distribution and ultimate freedom for the user in
mind). For users the problem is real; of course, it's usually a consequence of
either not being aware of those mechanisms or of the ambiguous semantics of
USE=debug.

And what about pushing some bash-domain FEATURES to USE flags? Like nostrip, 
splitdebug? I guess being able to set it per package is important.

-- 
regards
MM




[gentoo-dev] Monthly Gentoo Council Reminder for December

2008-12-01 Thread Mike Frysinger
This is your monthly friendly reminder !  Same bat time (typically
the 2nd Thursday at 2000 UTC / 1600 EST), same bat channel
(#gentoo-council @ irc.freenode.net) !

If you have something you'd wish for us to chat about, maybe even
vote on, let us know !  Simply reply to this e-mail for the whole
Gentoo dev list to see.

Keep in mind that every GLEP *re*submission to the council for review
must first be sent to the gentoo-dev mailing list 7 days (minimum)
before being submitted as an agenda item which itself occurs 7 days
before the meeting.  Simply put, the gentoo-dev mailing list must be
notified at least 14 days before the meeting itself.
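
The deadline arithmetic can be worked through with a short sketch. It assumes the meeting really falls on the second Thursday, which the reminder only describes as typical.

```python
from datetime import date, timedelta

def second_thursday(year, month):
    """Date of the second Thursday of the given month."""
    first = date(year, month, 1)
    # weekday(): Monday is 0, Thursday is 3
    offset = (3 - first.weekday()) % 7
    return first + timedelta(days=offset + 7)

meeting = second_thursday(2008, 12)
agenda_deadline = meeting - timedelta(days=7)        # agenda item submitted
list_deadline = agenda_deadline - timedelta(days=7)  # gentoo-dev notified

print(meeting, agenda_deadline, list_deadline)
```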

For more info on the Gentoo Council, feel free to browse our homepage:
http://www.gentoo.org/proj/en/council/



Re: [gentoo-dev] Monthly Gentoo Council Reminder for December

2008-12-01 Thread Ciaran McCreesh
On 01 Dec 2008 05:30:01
Mike Frysinger [EMAIL PROTECTED] wrote:
 If you have something you'd wish for us to chat about, maybe even
 vote on, let us know !  Simply reply to this e-mail for the whole
 Gentoo dev list to see.

Please give the OK on the following, assuming no objections crop up
before then:

* [RFC] Label profiles with EAPI for compatibility checks (revised)
  http://archives.gentoo.org/gentoo-dev/msg_930f58fcebcbbcbe523c001f2c825179.xml

* EAPI change: Call ebuild functions from trusted working directory
  http://archives.gentoo.org/gentoo-dev/msg_5ba467bbd5a0820e040210683702a67f.xml

* RFC: DEFINED_PHASES magic metadata variable
  http://archives.gentoo.org/gentoo-dev/msg_8c34d8efbc0d31ab28c517403dc83f62.xml

-- 
Ciaran McCreesh




Re: [gentoo-dev] Jeeves IRC replacement now alive - Willikins

2008-12-01 Thread Jeremy Olexa
On Wed, Aug 6, 2008 at 3:18 PM, Robin H. Johnson [EMAIL PROTECTED] wrote:

 Getting the bot out there
 -
 If you would like to have the new bot in your #gentoo-* channel, would
 each channel founder/leader please respond to this thread, stating the
 channel name, and that they are the contact for any problems/troubles.

Hi,
#gentoo-prefix please. I am the channel founder and am available on
irc for 'issues'
Thanks,
Jeremy



Re: [gentoo-dev] debug/release builds extensions/clarification proposal

2008-12-01 Thread Marius Mauch
On Mon, 01 Dec 2008 11:39:35 +0300
Peter Volkov [EMAIL PROTECTED] wrote:

 This leads me to different conclusion. I was thinking about new
 portage feature: emerge --info pkg . So to make portage show not
 only global information but per-package either. In many cases this
 will simplify analyzing of the problem.

That feature already exists (for installed packages at least).

Marius



Re: [gentoo-dev] Re: debug/release builds extensions/clarification proposal

2008-12-01 Thread Maciej Mrozowski
On Monday 01 of December 2008 08:04:04 Duncan wrote:
 (Of
 course, if it's the latter, it will need to be an official GLEP, and
 you'll have three separate package managers and their developers to push
 the proposal thru to at least to general agreement, or the council will
 almost certainly reject the GLEP, if it gets even that far.)

That I found interesting - what does any third-party package manager have to
do with setting policies and enhancements regarding the official Gentoo package
manager? Have you ever heard of liberum veto? But that's off topic, of course.

-- 
regards
MM




Re: [gentoo-dev] [RFC] Saving package emerge output (einfo, elog, ewarn, etc.) somewhere official

2008-12-01 Thread Gilles Dartiguelongue
Summarizing what I've read in this thread, it seems you want to find
a way to help users find information they aren't looking for.

If users aren't curious about their system, they will surely have a hard
time figuring out how to fix it if need be. PORTAGE_ELOG_* isn't really
that hard to find in make.conf.example (even though its new
location makes it a bit harder to find).

As others have said, there are already proper systems, documentation and
links from other docs. Not finding this is what I'd call laziness
or a lack of Google-fu. Don't misunderstand me, some things can slip under
everyone's radar; that's OK, real people are still here to point you
in the right direction.

If you find a better way to convey this information to users, then
please surprise me. For now I think we are in good shape.

-- 
Gilles Dartiguelongue [EMAIL PROTECTED]
Gentoo




Re: [gentoo-dev] [RFC] Saving package emerge output (einfo, elog, ewarn, etc.) somewhere official

2008-12-01 Thread Joe Peterson
Gilles Dartiguelongue wrote:
 As others have said, there are already proper systems, documentation and
 links from other docs. Not finding this is what I'd call laziness
 or a lack of Google-fu. Don't misunderstand me, some things can slip under
 everyone's radar; that's OK, real people are still here to point you
 in the right direction.

I think that I probably did not express my idea as well as I could have, since
most of the responses I have gotten have echoed your thoughts that Gentoo
does, indeed, have the facilities to achieve flexibility in logging, etc.

I totally agree.  Gentoo's capabilities, although not perfect, of course, are
superlative and are a complement to its superb online doc.  I think that's a
big reason why we're all here - we see this and appreciate this.  In fact,
even when I do not include the word gentoo in a Google search, I more often
than not end up at a Gentoo doc page - this is impressive.

However, what I see as perhaps a missing piece is more conceptual: the
important connection between the valuable info in the emerge logs (and their
somewhat transient default nature) and what a user looks for when he/she has a
problem with a package.  Yes, users will realize this as they use Gentoo (and
will start paying more attention to logs as a result), so I don't think it's a
huge problem, but what this particular user said to me made me think that
there is, perhaps, an opportunity to improve the situation.

There is no Gentoo-specific readme facility, which could be the obvious and
de facto place to go when there is trouble.  I can imagine that a fairly simple
and low-effort way of starting such a resource would be to simply echo the log
output into a package-specific file in a known place (or put it in the portage
db).  The logging facilities allow similar things if configured to do so, but
this is not on by default.  Once users know where to go to see the
instructions or notes on getting a package up and running after
installation, this would become a good place to keep such info or to expand on
how the facility works.  Starting with just the plain emerge log output would
be an easy way to get some benefit if such a concept has merit.  And by no
means would such a thing be an attempt to replace the excellent on-line docs or
wiki, either - I see both as having unique strengths.  For example, for
detailed info on packages, the wiki/web stuff is the better resource.  For a
quick check of whether a revdep-rebuild might have been necessary after
installing a new package, the log/notes would typically be the place to look.
The notes also have the key advantage that they would *always* contain what the
log output was, whereas whether a wiki or web page exists for a particular
package depends on whether someone spent the time to author one.
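
A minimal sketch of the "echo the log output into a package-specific file in a known place" idea follows. The directory layout, the .notes suffix and the sample message are all hypothetical, and a temporary directory stands in for whatever fixed location would actually be chosen.

```python
import os
import tempfile

def save_notes(notes_root, package, messages):
    """Append one install's messages to <notes_root>/<category>/<name>.notes."""
    category, name = package.split("/", 1)
    directory = os.path.join(notes_root, category)
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name + ".notes")
    with open(path, "a", encoding="utf-8") as notes:
        for level, text in messages:
            notes.write(f"{level}: {text}\n")
    return path

notes_root = tempfile.mkdtemp()  # stand-in for a fixed, well-known directory
path = save_notes(notes_root, "app-portage/elogv",
                  [("elog", "Example message shown after installation.")])
with open(path, encoding="utf-8") as notes:
    print(notes.read(), end="")
```

Appending rather than overwriting keeps the notes from earlier installs of the same package.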

My intention with the RFC was to see if the concept has any worth and to kick
it around a bit.  I do not really see this as a deficiency in Gentoo's
technology (which I have a feeling is how many here have interpreted it), but
simply something that, if done correctly, could be useful.

-Joe



Re: [gentoo-dev] [RFC] Saving package emerge output (einfo, elog, ewarn, etc.) somewhere official

2008-12-01 Thread Marius Mauch
On Mon, 01 Dec 2008 15:35:32 -0700
Joe Peterson [EMAIL PROTECTED] wrote:

 My intention with the RFC was to see if the concept has any worth and
 to kick it around a bit.  I do not really see this as a deficiency in
 Gentoo's technology (which I have a feeling is how many here have
 interpreted it), but simply something that, if done correctly, could
 be useful.

Maybe provide a real example to demonstrate the difference between the
current solutions and what you're looking for, because I still don't
understand what you're after (using all the different terms, logs,
notes, docs, plain emerge log, ... without further explanation doesn't
help much to clear things up).

Marius



Re: [gentoo-portage-dev] Re: search functionality in emerge

2008-12-01 Thread Emma Strubell
I completely forgot about Google's Summer of Code! Thanks for reminding me.
Hopefully I won't forget again by the time summer rolls around, obviously I
wouldn't mind getting a little extra money for doing something I'd do for
free anyway.

On a more related note: What, exactly, does porttree.py do? And am I correct
in thinking that my suffix tree(s) should somewhat replace porttree.py? Or,
should I be using porttree.py in order to populate my tree? I think I have
the suffix tree sufficiently figured out, I'm just trying to determine
where, exactly, the tree will fit in to the portage code, and what the best
way to populate it (with package names and some corresponding metadata)
would be.
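
For readers unfamiliar with the approach, here is a deliberately naive stand-in for a suffix tree: a sorted list of every suffix of every package name, searched with bisect. It is not the planned implementation, only a sketch of how suffix-based substring search works; the package names are sample data.

```python
import bisect

def build_suffix_index(names):
    """Sorted list of (suffix, full name) pairs for all suffixes of all names."""
    entries = []
    for name in names:
        for start in range(len(name)):
            entries.append((name[start:], name))
    entries.sort()
    return entries

def search(index, query):
    """Names containing query as a substring, found via suffixes starting with it."""
    position = bisect.bisect_left(index, (query,))
    matches = set()
    for suffix, name in index[position:]:
        if not suffix.startswith(query):
            break
        matches.add(name)
    return sorted(matches)

index = build_suffix_index(
    ["app-portage/elogv", "app-portage/elogviewer", "kde-base/kde"])
print(search(index, "elogv"))
print(search(index, "kde"))
```

A real suffix tree gives the same answers with far less memory than this quadratic list.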

On Mon, Dec 1, 2008 at 2:34 AM, Duncan [EMAIL PROTECTED] wrote:

 Emma Strubell [EMAIL PROTECTED] posted
 [EMAIL PROTECTED], excerpted
 below, on  Sun, 30 Nov 2008 18:42:11 -0500:

  i am really
  interested in contributing to Gentoo and portage in the future, though.
  I'm thinking this summer I'll have a chance...

 FWIW, Gentoo usually participates in the Google Summer of Code.  Assuming
 they have it again next year, if you're already considering spending some
 time on Gentoo code this summer, might as well try to get paid a little
 something for it.  It could/should be a nice resume booster, too. =:^)

 --
 Duncan - List replies preferred.   No HTML msgs.
 Every nonfree program has a lord, a master --
 and if you use the program, he is your master.  Richard Stallman





Re: [gentoo-portage-dev] Re: search functionality in emerge

2008-12-01 Thread Zac Medico
Emma Strubell wrote:
 I completely forgot about Google's Summer of Code! Thanks for reminding me.
 Hopefully I won't forget again by the time summer rolls around, obviously I
 wouldn't mind getting a little extra money for doing something I'd do for
 free anyway.
 
 On a more related note: What, exactly, does porttree.py do? And am I correct
 in thinking that my suffix tree(s) should somewhat replace porttree.py? Or,
 should I be using porttree.py in order to populate my tree?

You should use porttree.py to populate it. Specifically, you should
use portdbapi.aux_get() calls to access the package metadata that
you'll need, similar to how the code in the existing search class
accesses it.

 I think I have
 the suffix tree sufficiently figured out, I'm just trying to determine
 where, exactly, the tree will fit in to the portage code, and what the best
 way to populate it (with package names and some corresponding metadata)
 would be.

There are three possible times that I imagine a person might want to
populate it:

1) Automatically after emerge --sync. This should not be mandatory
since it will be somewhat time consuming and some users are very
sensitive about --sync time. Note that FEATURES=metadata-transfer is
disabled by default in the latest versions of portage, specifically
to reduce --sync time.

2) On demand, when emerge --search is invoked. The calling user will
need appropriate file system permissions in order to update the
search index.

3) On request, by calling a command that is specifically designed to
generate the search index. This could be a subcommand of emaint.

For the index file format, it would be simplest to use a python
pickle file, but you might choose another format if you'd like the
index to be accessible without python and the portage API (probably
not necessary).
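
A small sketch of the pickle option: the index is written to a temporary file and then renamed into place, so a concurrent emerge --search never reads a half-written file. The sample entries are made up; on a real system they would come from portdbapi.aux_get() as described above, and the file name is an assumption.

```python
import os
import pickle
import tempfile

# Made-up stand-in for metadata gathered through the portage API.
SAMPLE_METADATA = {
    "app-portage/elogv": {"DESCRIPTION": "ncurses elog viewer"},
    "app-portage/elogviewer": {"DESCRIPTION": "GUI elog viewer"},
}

def write_index(index, path):
    """Pickle the index to a temporary file, then rename it over the target."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        pickle.dump(index, tmp, protocol=pickle.HIGHEST_PROTOCOL)
    os.replace(tmp_path, path)

def read_index(path):
    """Load the index; only do this for files written by a trusted user."""
    with open(path, "rb") as index_file:
        return pickle.load(index_file)

index_path = os.path.join(tempfile.mkdtemp(), "search-index.pickle")
write_index(SAMPLE_METADATA, index_path)
print(sorted(read_index(index_path)))
```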
-- 
Thanks,
Zac



Re: [gentoo-portage-dev] Re: search functionality in emerge

2008-12-01 Thread Emma Strubell
Thanks for the clarification. I was planning on forcing an update of the
index as a part of emerge --sync, and implementing a command that would
update the search index (leaving it up to the user to update after making
any manual changes to the portage tree). That way the search index should
always be up-to-date when emerge -s is called. It does make sense for the
update upon --sync to be optional, but I guess I don't see why the update
should always be SO slow. Of course the first population of the tree will
take quite a while, but assuming regular (daily?) --syncs (and therefore
updates to the index), subsequent updates shouldn't take very long, since
there will only be a few (hundred?) changes to be made to the tree.
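
To illustrate why later updates could be cheap, here is a sketch that compares what the index knew at the last update with the current state of the tree and reports only the differences. Representing each package by a checksum string is an assumption made for the example.

```python
def plan_update(indexed, current):
    """Compare {package: checksum} maps and list what the index must change."""
    added = sorted(current.keys() - indexed.keys())
    removed = sorted(indexed.keys() - current.keys())
    changed = sorted(pkg for pkg in current.keys() & indexed.keys()
                     if current[pkg] != indexed[pkg])
    return added, removed, changed

before = {"app-portage/elogv": "a1", "kde-base/kde": "b2"}
after = {"app-portage/elogv": "a9", "app-portage/elogviewer": "c3"}
print(plan_update(before, after))
```

Only the packages in the three returned lists need touching; everything else in the index stays as it is.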

And I do plan on pickling the search tree :]

Emma





Re: [gentoo-portage-dev] Re: search functionality in emerge

2008-12-01 Thread Tambet
I would suggest a different approach to updates. If you want to change the
portage tree manually, you have to create an overlay. An overlay, since it is
updated and managed by a human being, will always be small (unless someone
writes a script that creates a million overlay updates, but I don't think
that would be an efficient way to do anything). So, when you search, you can
search the Portage tree using the index, which is updated with --sync, and
then search the overlay, which is small and fast to search anyway. The
overlay does not need an index in that case. If anyone changes the portage
tree by hand, those changes will be lost with the next --sync, so no one
should do it anyway; this case need not be considered at all.
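
The split described above can be sketched as follows. This is only an illustration under stated assumptions: main_index is any collection of category/package names built at --sync time, and the function names scan_overlay and search are hypothetical, not portage API.

```python
import os


def scan_overlay(root):
    """Walk one overlay directory and yield 'category/package' names.

    Assumption: every two-level directory is a package. A real
    implementation would skip non-category directories such as profiles/.
    """
    for category in sorted(os.listdir(root)):
        category_dir = os.path.join(root, category)
        if not os.path.isdir(category_dir):
            continue
        for package in sorted(os.listdir(category_dir)):
            if os.path.isdir(os.path.join(category_dir, package)):
                yield category + "/" + package


def search(term, main_index, overlay_roots):
    """Match 'term' against the indexed main tree, then scan each overlay."""
    matches = {name for name in main_index if term in name}
    for root in overlay_roots:
        matches.update(name for name in scan_overlay(root) if term in name)
    return sorted(matches)
```

Only the overlays touch the file system at search time, which is cheap as long as they stay small.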

Tambet - technique evolves to art, art evolves to magic, magic evolves to
just doing.


2008/12/1 Emma Strubell [EMAIL PROTECTED]

 Thanks for the clarification. I was planning on forcing an update of the
 index as a part of emerge --sync, and implementing a command that would
 update the search index (leaving it up to the user to update after making
 any manual changes to the portage tree). That way the search index should
 always be up-to-date when emerge -s is called. It does make sense for the
 update upon --sync to be optional, but I guess I don't see why the update
 should always be SO slow. Of course the first population of the tree will
 take quite a while, but assuming regular (daily?) --syncs (and therefore
 updates to the index), subsequent updates shouldn't take very long, since
 there will only be a few (hundred?) changes to be made to the tree.

 And I do plan on pickling the search tree :]

 Emma


 On Mon, Dec 1, 2008 at 12:52 PM, Zac Medico [EMAIL PROTECTED] wrote:


 Emma Strubell wrote:
  I completely forgot about Google's Summer of Code! Thanks for reminding
 me.
  Hopefully I won't forget again by the time summer rolls around,
 obviously I
  wouldn't mind getting a little extra money for doing something I'd do
 for
  free anyway.
 
  On a more related note: What, exactly, does porttree.py do? And am I
 correct
  in thinking that my suffix tree(s) should somewhat replace porttree.py?
 Or,
  should I be using porttree.py in order to populate my tree?

 You should use porttree.py to populate it. Specifically, you should
 use portdbapi.aux_get() calls to access the package metadata that
 you'll need, similar to how the code in the existing search class
 accesses it.
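
A rough sketch of populating an index this way. cp_all(), xmatch() and aux_get() are portdbapi methods, but the way they are combined here, the "bestmatch-visible" choice, and the build_index name are assumptions. FakePortdb is a hypothetical stand-in so the example runs without portage installed; return conventions of the real methods vary between portage versions.

```python
def build_index(portdb):
    """Map each 'category/package' to the DESCRIPTION of its best visible version."""
    index = {}
    for cp in portdb.cp_all():
        cpv = portdb.xmatch("bestmatch-visible", cp)
        if not cpv:
            # No visible version; a fuller index might fall back to
            # another match level here.
            continue
        (description,) = portdb.aux_get(cpv, ["DESCRIPTION"])
        index[cp] = description
    return index


class FakePortdb:
    """Hypothetical stub exposing the three portdbapi methods used above."""

    _best = {"app-editors/vim": "app-editors/vim-7.2", "sys-apps/hidden": ""}
    _metadata = {
        "app-editors/vim-7.2": {"DESCRIPTION": "Vim, an improved vi-style text editor"},
    }

    def cp_all(self):
        return sorted(self._best)

    def xmatch(self, level, cp):
        return self._best[cp]

    def aux_get(self, cpv, keys):
        return [self._metadata[cpv][key] for key in keys]
```

With a real tree, the stub would be replaced by the portdbapi instance that the existing search class already holds.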

  I think I have
  the suffix tree sufficiently figured out, I'm just trying to determine
  where, exactly, the tree will fit in to the portage code, and what the
 best
  way to populate it (with package names and some corresponding metadata)
  would be.

 There are three possible times that I imagine a person might want to
 populate it:

 1) Automatically after emerge --sync. This should not be mandatory
 since it will be somewhat time consuming and some users are very
 sensitive about --sync time. Note that FEATURES=metadata-transfer is
 disabled by default in the latest versions of portage, specifically
 to reduce --sync time.

 2) On demand, when emerge --search is invoked. The calling user will
 need appropriate file system permissions in order to update the
 search index.

 3) On request, by calling a command that is specifically designed to
 generate the search index. This could be a subcommand of emaint.

 For the index file format, it would be simplest to use a python
 pickle file, but you might choose another format if you'd like the
 index to be accessible without python and the portage API (probably
 not necessary).
 - --
 Thanks,
 Zac





Re: [gentoo-portage-dev] Re: search functionality in emerge

2008-12-01 Thread Emma Strubell
Good point. I may just ignore overlays completely because 1) I don't use
them and 2) does anyone really need to search an overlay anyway? Aren't any
packages added via an overlay added deliberately?







Re: [gentoo-portage-dev] Re: search functionality in emerge

2008-12-01 Thread René 'Necoro' Neumann

Emma Strubell wrote:
 2) does anyone really need to search an overlay anyway?

Of course. Take large (semi-)official overlays like sunrise. They can
easily be seen as a second portage tree.



Re: [gentoo-portage-dev] Time to say goodbye

2008-12-01 Thread Ned Ludd

On Sun, 2008-11-30 at 16:19 +0100, Marius Mauch wrote:
 So, time has come for me to realize that my time with Gentoo is over. I
 haven't actually been doing much Gentoo work over the last months due
 to personal reasons (nothing Gentoo related), and I don't see that
 situation changing in the near future. In fact I've already reassigned
 or dropped most of my responsibilities in Gentoo a while ago, so there
 are just a few pet projects left to give away:
 - my gentoo-stats project (in the portage/gentoo-stats svn repository).
 I know quite a few people are interested in the idea of collecting
 various statistical data from gentoo user systems, and I'd encourage
 everyone who wants to implement such a system to at least look at it (I
 might even have finished it if I hadn't wasted my time focusing on
 the wrong problems). There is also quite a bit of documentation that
 should help to get you started.
 - a graphical security update tool (see bug #190397)
 
 So if anyone wants to adopt those, complete or just parts, just take
 them. As for Portage, Zac has practically already filled my role.
 
 So I guess that wraps it up. It's been a nice ride most of the time,
 but now it's time for me to leave the Gentoo train.
 
 Marius

I will always remember you as the guy who provided us with the much
needed glsa*.py (thank you again)
Take care and I wish you the best in all your future endeavors.



-- 
Ned Ludd [EMAIL PROTECTED]
Gentoo Linux




Re: [gentoo-portage-dev] Re: search functionality in emerge

2008-12-01 Thread Tambet
2008/12/2 Emma Strubell [EMAIL PROTECTED]

 True, true. Like I said, I don't really use overlays, so excuse my
 ignorance.


Do you know the proper order of doing things?

Rules of Optimization:

   - Rule 1: Don't do it.
   - Rule 2 (for experts only): Don't do it yet.

What this actually means: functionality comes first, readability comes
next, and optimization comes last, unless you are creating a fancy 3D engine
for a kung fu game.

If you exclude overlays, you are removing functionality, and indeed
absolutely has-to-be-there functionality, because no one would intuitively
expect a search function to search only a subset of packages, however
reasonable that subset might be. So you can't, just can't, add this to the
portage base; you could only write yet another external search package for
portage.

I looked at the code a bit:
Portage's __init__.py contains the comment "# search functionality". After
this comment there is a nice and simple search class.
It also contains the function action_sync(...), which holds the
synchronization code.

The search class is initialized by setting up three databases: porttree,
bintree and vartree, whatever those are. They are stored in the self._dbs
array, and porttree is also in self._portdb.

It contains some more methods:
_findname(...) returns the result of self._portdb.findname(...) with the
same parameters, or None if it does not exist.
Other methods do similar things: each maps to one method or another.
execute does the real search...
The line "for package in self.portdb.cp_all()" is the important one here: it
currently loops over the whole portage tree. All kinds of matching are done
inside the loop.
self.portdb obviously points to porttree.py (unless it points to a fake
tree).
cp_all takes all porttrees and does a simple file search inside them. This
method is where an optional index search should go.

self.porttrees = [self.porttree_root] + \
    [os.path.realpath(t) for t in
     self.mysettings["PORTDIR_OVERLAY"].split()]

So self.porttrees contains the list of trees: the first one is the root, and
the others are overlays.

Now, what you have to do will not be any harder just because it has to
search overlays, too.

You have to create a method, def cp_index(self), which returns a dictionary
with package names as keys. The "for oroot..." loop will then iterate over
self.porttrees[1:] instead of self.porttrees, so that it only scans the
overlays, and "d = {}" will be replaced with "d = self.cp_index()". If the
index is not there, the old version will be used (thus you have to make an
internal porttrees variable that contains either all trees or all except the
first).
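
The cp_index() idea might look roughly like this. It is a simplified, hypothetical stand-in for the cp_all() logic rather than the real porttree.py code: the IndexedTree class, its constructor arguments, and the use of os.listdir are assumptions made so the sketch is self-contained.

```python
import os


class IndexedTree:
    """Hypothetical, simplified stand-in for the cp_all() logic in porttree.py."""

    def __init__(self, porttrees, categories, index=None):
        self.porttrees = porttrees    # main tree first, overlays after it
        self.categories = categories
        self._index = index           # iterable of 'category/package', or None

    def cp_index(self):
        """Return a dict keyed by package name, or None if no index exists."""
        if self._index is None:
            return None
        return dict.fromkeys(self._index)

    def cp_all(self):
        d = self.cp_index()
        if d is None:
            # No index: fall back to scanning every tree, as the old code does.
            d = {}
            trees = self.porttrees
        else:
            # The index already covers the main tree; scan only the overlays.
            trees = self.porttrees[1:]
        for category in self.categories:
            for oroot in trees:
                category_dir = os.path.join(oroot, category)
                if not os.path.isdir(category_dir):
                    continue
                for package in os.listdir(category_dir):
                    if os.path.isdir(os.path.join(category_dir, package)):
                        d[category + "/" + package] = None
        return sorted(d)
```

Because the fallback path is the unmodified scan, turning the index off just means passing index=None.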

The other methods used by search are xmatch and aux_get; the first is used
several times and the second is used to get the description. You have to
cache the results of those specific queries and make the methods use your
cache. As you can see, those parts of portage are already able to handle
overlays. Thus you again have to put your code at the beginning of those
functions: create index_xmatch and index_aux_get methods, then make xmatch
and aux_get call them and return their results unless those are None (or
some other sentinel, in case None is already a legal result). If they return
None, the old code runs and does its job. If the index has not been created,
the result is None. In the index_* methods, just check whether the query is
one you can answer, and if it is, answer it.
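
A sketch of that try-the-index-first pattern for aux_get. The CachedLookup wrapper, the (cpv, key) index layout, and the slow_aux_get callback are hypothetical; in portage the check would sit at the top of the existing method instead of in a separate class.

```python
_MISSING = object()


class CachedLookup:
    """Hypothetical wrapper: try the index first, then the slow lookup."""

    def __init__(self, slow_aux_get, index=None):
        self._slow_aux_get = slow_aux_get   # e.g. the existing aux_get
        self._index = index                 # {(cpv, key): value}, or None

    def index_aux_get(self, cpv, keys):
        """Answer from the index, or return None if it cannot answer."""
        if self._index is None:
            return None
        values = [self._index.get((cpv, key), _MISSING) for key in keys]
        if any(value is _MISSING for value in values):
            return None
        return values

    def aux_get(self, cpv, keys):
        result = self.index_aux_get(cpv, keys)
        if result is not None:
            return result
        # Index missing or incomplete: run the old code path.
        return self._slow_aux_get(cpv, keys)
```

None works as the "cannot answer" signal here because a successful lookup always returns a list; an index_xmatch helper, where an empty result may be legitimate, would need a dedicated sentinel like _MISSING.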

Obviously, the simplest way to create your index is to delete the old index
and then use those same methods to query for all the necessary information.
The fastest way would be to update the index directly during sync, which you
could do later.

Please also add commands to turn the index on and off (the latter should
also delete it to save disk space). The default should be off until it's
fast, small and reliable. Also note that if the index is kept on the hard
drive, it might be faster if it's compressed (gz, for example): reading a
compressed file and decompressing it takes more processing power, but can
take less time, than reading the full uncompressed file.

Good luck!





Re: [gentoo-portage-dev] Re: search functionality in emerge

2008-12-01 Thread Emma Strubell
yes, yes, i know, you're right :]

and thanks a bunch for the outline! about the compression, I agree that it
would be a good idea, but I don't know how to implement it. not that it
would be difficult... I'm guessing there's a gzip module for python that
would make it pretty straightforward? I think I'm getting ahead of myself,
though. I haven't even implemented the suffix tree yet!
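
For reference, Python's standard library does include a gzip module, and it combines directly with pickle. A minimal sketch, in which the helper names and the idea of storing the index as one pickled object are assumptions:

```python
import gzip
import pickle


def save_compressed(index, path):
    """Pickle the index and gzip it in one pass."""
    with gzip.open(path, "wb") as handle:
        pickle.dump(index, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_compressed(path):
    """Read back an index written by save_compressed()."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)
```

Whether the compressed file actually loads faster depends on the disk and CPU involved, so it is worth timing both variants before choosing one.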

Emma


 On Mon, Dec 1, 2008 at 5:17 PM, René 'Necoro' Neumann [EMAIL