Re: [gentoo-portage-dev] Re: [gentoo-dev] EBUILD_FORMAT support

2005-08-26 Thread Paul de Vrieze
On Friday 26 August 2005 17:11, Paul de Vrieze wrote:
> On Friday 26 August 2005 16:58, Ciaran McCreesh wrote:
> > On Fri, 26 Aug 2005 14:50:52 +0200 Paul de Vrieze <[EMAIL PROTECTED]>
> >
> > wrote:
> > | ps. People please be aware that this is still alpha in the sense of
> > | not being complete. For better working it should probably support
> > | if statements properly, and at least do variable substitution. It
> > | would mean that the parser would have to retain a state etc.
> >
> > Isn't this pretty much a waste of time if it can't handle all the
> > code in versionator? We're trying to move people away from ugly
> > unreliable manual substitutions towards readable, maintainable code
> > using the eclass...
>
> With lack of variable substitution support I mean that it just forwards
> the variable substitutions to the calling application (output). It
> should probably also be made more aware of the variables that are
> always extended like USE and DEPEND.

I just checked the versionator eclass though, and indeed it wouldn't 
support it. Versionator uses functions inside the variables. The parser 
does not parse functions at all beyond being able to determine their end. 
Perhaps it would be best to handle versionator specially and internalize 
the functions. While it is possible to interpret the bash functions this 
would mean full bash function duplication, make the parser more complex 
and diminish the speed of the parser.

I could even implement this function mimicking in such a way that
unsupported functions are automatically flagged as requiring
compatibility mode (the parser is uncertain about its results, and the
old parser should be used).
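
A minimal sketch of that fallback idea, in Python for brevity. The names
CompatibilityModeRequired, INTERNALIZED and evaluate_call are
hypothetical and not taken from portage_native, and the
get_major_version stand-in only splits on dots rather than reproducing
everything the eclass does:

```python
class CompatibilityModeRequired(Exception):
    """Raised when the fast parser cannot vouch for its result."""


# Hypothetical table of eclass functions mimicked inside the parser.
# The stand-in below is deliberately simplified: it only splits on dots.
INTERNALIZED = {
    "get_major_version": lambda version: version.split(".")[0],
}


def evaluate_call(name, *args):
    """Evaluate an internalized function, or signal compatibility mode."""
    handler = INTERNALIZED.get(name)
    if handler is None:
        # Unknown function: the caller should fall back to the old parser.
        raise CompatibilityModeRequired(name)
    return handler(*args)
```

A caller would catch the exception and hand the ebuild to the bash-based
parser instead.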

Paul

-- 
Paul de Vrieze
Gentoo Developer
Mail: [EMAIL PROTECTED]
Homepage: http://www.devrieze.net




Re: [gentoo-portage-dev] Re: [gentoo-dev] EBUILD_FORMAT support

2005-08-26 Thread Paul de Vrieze
On Friday 26 August 2005 16:58, Ciaran McCreesh wrote:
> On Fri, 26 Aug 2005 14:50:52 +0200 Paul de Vrieze <[EMAIL PROTECTED]>
>
> wrote:
> | ps. People please be aware that this is still alpha in the sense of
> | not being complete. For better working it should probably support if
> | statements properly, and at least do variable substitution. It would
> | mean that the parser would have to retain a state etc.
>
> Isn't this pretty much a waste of time if it can't handle all the code
> in versionator? We're trying to move people away from ugly unreliable
> manual substitutions towards readable, maintainable code using the
> eclass...

With lack of variable substitution support I mean that it just forwards 
the variable substitutions to the calling application (output). It should 
probably also be made more aware of the variables that are always
extended like USE and DEPEND.

Paul

-- 
Paul de Vrieze
Gentoo Developer
Mail: [EMAIL PROTECTED]
Homepage: http://www.devrieze.net




Re: [gentoo-portage-dev] Re: [gentoo-dev] EBUILD_FORMAT support

2005-08-26 Thread Paul de Vrieze
On Friday 26 August 2005 16:58, Ciaran McCreesh wrote:
> On Fri, 26 Aug 2005 14:50:52 +0200 Paul de Vrieze <[EMAIL PROTECTED]>
>
> wrote:
> | ps. People please be aware that this is still alpha in the sense of
> | not being complete. For better working it should probably support if
> | statements properly, and at least do variable substitution. It would
> | mean that the parser would have to retain a state etc.
>
> Isn't this pretty much a waste of time if it can't handle all the code
> in versionator? We're trying to move people away from ugly unreliable
> manual substitutions towards readable, maintainable code using the
> eclass...

Hey, it said "alpha". I've just been working on having it parse even more 
eclasses. It now handles multilib.eclass, and I'll be working on making 
eutils parse fully. I need to revamp the variable substitution 
recognition a bit. I'll make sure that it handles versionator too.

Paul

-- 
Paul de Vrieze
Gentoo Developer
Mail: [EMAIL PROTECTED]
Homepage: http://www.devrieze.net




Re: [gentoo-portage-dev] Re: [gentoo-dev] EBUILD_FORMAT support

2005-08-26 Thread Ciaran McCreesh
On Fri, 26 Aug 2005 14:50:52 +0200 Paul de Vrieze <[EMAIL PROTECTED]>
wrote:
| ps. People please be aware that this is still alpha in the sense of
| not being complete. For better working it should probably support if 
| statements properly, and at least do variable substitution. It would
| mean that the parser would have to retain a state etc.

Isn't this pretty much a waste of time if it can't handle all the code
in versionator? We're trying to move people away from ugly unreliable
manual substitutions towards readable, maintainable code using the
eclass...

-- 
Ciaran McCreesh : Gentoo Developer (Vim, Shell tools, Fluxbox, Cron)
Mail: ciaranm at gentoo.org
Web : http://dev.gentoo.org/~ciaranm





Re: [gentoo-portage-dev] Re: [gentoo-dev] EBUILD_FORMAT support

2005-08-26 Thread Paul de Vrieze
On Friday 26 August 2005 09:35, Brian Harring wrote:
> Any parser that doesn't support full bash syntax isn't acceptable from
> where I sit; re: slow down, 2.1 is around 33% faster sourcing the
> whole tree (some cases 60% faster, some 5%, etc).  The speedups are
> also what allow templates to be swapped, the eapi concept.

Many things are not allowed at the toplevel of an ebuild. Basically,
things must be deterministic for the cache to work. I have built an
extension that would parse 98% of current ebuilds properly, and much
(more than 10 times) faster than the bash/ecache way. It takes the form
of a Python module written in C. It simply ignores the functions, so
anything is allowed in there. As such the parser understands enough of
bash to support the toplevel. Even variable substitution and inherit are
supported. What is not supported are various kinds of uncommon
substitution tricks that should probably not happen in the toplevel
either.

EAPI could also be used to express capabilities. Say portage supports
version 2-relaxed and version 2-strict. 2-relaxed has all the bash
freedom and is parsed using bash. 2-strict would allow parsing by a
faster parser module, but would limit the bash freedom. I am not saying
we have to do this, but if ebuild and eclass EAPI declarations follow a
few very simple rules that are normally obeyed anyway, it would be
possible to support this in the future.

One of the problems I see with the current ebuild format is that it is
impossible to make incompatible changes at all. This means that many
features that might be desired cannot be implemented. EAPI can relieve
that. To make this easier there should be an easy way to get the EAPI of
a package.

>
> I'd note limiting the bash capabilities is a restriction that
> transcends anything EAPI should supply; changes to what's possible in
> the language (a subset of bash syntax as you're suggesting) are a
> separate format from where I draw the line in the sand.

What I suggest is making a policy that would make this possible in the 
future. Note that I do not wish to restrict any bash functionality in the 
various functions in the ebuild. 

> Mainly, limiting the syntax has the undesired effect of deviating from
> what users/devs know already; mistakes *will* occur.  QA tools can be
> written, but people are fallible; both in writing a QA tool, and
> abiding by the syntax subset allowed.

The QA tools would just be running the parser. If the parser chokes
(which it doesn't easily) then the ebuild does not conform to the
correct syntax. It's even possible to just compare the variables
returned by the two parsers. If they don't match, the format is wrong
for the C parser.
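
A rough sketch of such a comparison, assuming both parsers hand back
their variables as plain dictionaries. The function name
compare_metadata and the dictionary shape are assumptions made for
illustration, not the interface of either parser:

```python
def compare_metadata(bash_vars, native_vars):
    """Return {name: (bash_value, native_value)} for each mismatch.

    Both arguments are assumed to map variable names (DEPEND, SLOT, ...)
    to the string values produced by the respective parser.
    """
    differences = {}
    for name in sorted(set(bash_vars) | set(native_vars)):
        bash_value = bash_vars.get(name)      # None if the variable is absent
        native_value = native_vars.get(name)
        if bash_value != native_value:
            differences[name] = (bash_value, native_value)
    return differences
```

An empty result would mean the ebuild stays within what the C parser
handles; anything else marks it as needing the bash path.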

>
> > The restriction I propose would be:
> > - If EAPI is defined in the ebuild it should be unconditional, on
> >   its own line in the toplevel of the ebuild before any functions
> >   are defined (preferably the first element after the comments and
> >   whitespace).
> >
> > - If EAPI is not defined in the ebuild, but in an eclass, the
> >   inherit chain should be unconditional and direct. Furthermore, in
> >   the eclass the above rules should be followed.
> >
> > Please note that many of the conditions are already true for current
> > ebuilds, just portage can "handle" more.
>
> inherit chain must be unconditional anyways.  re: eapi placement, I
> would view that as somewhat arbitrary; the question is what gain it
> would give.

The gain of putting it at the top would be that there are fewer chances
for parsers to have choked on incompatible syntax. If EAPI is at the
top, at some point incompatible syntax might be allowed, and older
parsers could still retrieve the EAPI. Of course any syntax that works
with 'egrep "^[ \t]*EAPI[ \t]*="' should be no problem.
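
To illustrate, here is a small Python rendering of that expression. The
helper name find_eapi and the choice to return None when nothing matches
are assumptions made for this sketch; it strips surrounding quotes but
attempts no variable substitution:

```python
import re

# Optional leading blanks, the name EAPI, optional blanks, then '='.
# In this Python pattern \t stands for a tab character.
EAPI_LINE = re.compile(r'^[ \t]*EAPI[ \t]*=[ \t]*(.*)$')


def find_eapi(ebuild_text):
    """Return the value of the first EAPI assignment line, or None."""
    for line in ebuild_text.splitlines():
        match = EAPI_LINE.match(line)
        if match:
            # Drop surrounding whitespace and quotes; nothing is expanded.
            return match.group(1).strip().strip('"\'')
    return None
```

Because only a single line has to match, a parser that cannot digest the
rest of the file can still learn which EAPI it is looking at.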

>
> I'd wonder about the parsing speed of your parser; the difference
> between parsing ebuilds and running from cache metadata is several
> orders of magnitude; the current cache backend flat_list and portage
> design, properly corrected, ought to widen the gap too.
> General cache lookup is slow due to:
> A) bad call patterns, allowed by the api; N calls to get different
>    bits of metadata from a cpv, resulting in potentially N sets of
>    disk ops.
> B) default cache requires opening/closing a file per cpv lookup;
>    syscalls are killer here.
> C) every metadata lookup incurs 2 stats, ebuild and cache file.

This parser was part of a stranded rewrite attempt. One of the features
was that it regarded packages and package instances (specific files) as
objects whose attributes would be lazily evaluated. That means that it
would parse if the data is not available, and look it up otherwise. The
speed of "emerge -s" is stunning in that program as it uses a directory
search which is orders of magnitude faster than python doing the same
thing.
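
The lazy evaluation idea can be sketched roughly as follows. The class
name PackageInstance, the dictionary used as cache and the parse
callable are all hypothetical stand-ins, not code from the rewrite:

```python
class PackageInstance:
    """One specific ebuild file whose metadata is loaded on demand."""

    def __init__(self, path, cache, parse):
        self.path = path
        self._cache = cache      # dict-like: path -> metadata dict
        self._parse = parse      # callable: path -> metadata dict
        self._metadata = None    # nothing is loaded until first access

    @property
    def metadata(self):
        if self._metadata is None:
            cached = self._cache.get(self.path)
            if cached is None:
                # Not available yet: parse once and remember the result.
                cached = self._parse(self.path)
                self._cache[self.path] = cached
            self._metadata = cached
        return self._metadata
```

Creating the object costs nothing; the parser only runs when an
attribute is first needed and the cache has no entry for that file.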

> Getting to the point; cache is 100x to 400x faster than sourcing for
> <=2.0.51.  Haven't tested it under 2.1, should be different due to
> cache a

Re: [gentoo-portage-dev] Re: [gentoo-dev] EBUILD_FORMAT support

2005-08-26 Thread Paul de Vrieze
> Don't forget the fact that bash must be execed for normal parses, and
> that python has extremely slow string handling when not using one of
> the standard parsing modules (that work in C). To put my money where my
> mouth is, I've tarred up my code and put it on my dev space:
> http://dev.gentoo.org/~pauldv/portage_native-0.1.tar.bz2

I've fixed up a particular issue with it (getting into an endless loop) 
and made a very simple webpage for it:
http://dev.gentoo.org/~pauldv/

ps. People please be aware that this is still alpha in the sense of not 
being complete. For better working it should probably support if 
statements properly, and at least do variable substitution. It would mean 
that the parser would have to retain a state etc.

Paul

-- 
Paul de Vrieze
Gentoo Developer
Mail: [EMAIL PROTECTED]
Homepage: http://www.devrieze.net

