Hi,
On Tue, 2007-06-26 at 16:37 +0200, Henryk Plötz wrote:
> Sorry for being away so long. When I last worked on it, my frustration
> was growing and at some point it snapped and I threw the code into a
> corner. I then had some problems but am back now.
Ok, thanks for providing the patch. I've been giving this some careful
thought! :)
Since the changes touch a fairly wide spread of areas, I think we need
to commit this in stages. As an example, I've just looked at your
changes to the svn fetcher and noticed the problems you had with the
existing code making it difficult to add an "svn info" command. I've
just committed a patch to bitbake trunk that does this without the code
duplication your patch has. Your patch will need adapting to work with
it, but that should be fairly trivial.
For reference, I'm thinking we should get this working in bitbake-trunk
and then we can backport the changes needed to bitbake 1.8 in one lump.
> I faced several problems and will detail the possible solutions I
> found. There still remains a problem with bitbake's caching algorithm:
> It doesn't properly pick up remote changes because it uses a cached
> copy of the generated version identifier.
>
> What I did: I added to the fetcher module a function to export
> variables of the form SRCREV_refname for all URLs that have a refname
> parameter set.
What does refname default to? Does it always have to be set?
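For reference, here is roughly how I read the export you describe.
Everything below (the DataStore stand-in, the urldata layout,
export_srcrevs' signature) is illustrative rather than the actual patch:

```python
# Sketch: export SRCREV_<refname> for every URL whose parameters carry a
# refname. DataStore is a tiny stand-in for bb.data; the urldata shape
# and the export_srcrevs signature are assumptions, not the real patch.

class DataStore:
    def __init__(self):
        self._vars = {}

    def setVar(self, name, value):
        self._vars[name] = value

    def getVar(self, name):
        return self._vars.get(name)

def export_srcrevs(urldata, d):
    """urldata maps url -> dict of URL parameters plus a resolved revision."""
    for url, info in urldata.items():
        refname = info.get("refname")
        if refname is None:
            continue  # only URLs with an explicit refname are exported
        d.setVar("SRCREV_%s" % refname, info["revision"])

d = DataStore()
urldata = {
    "git://example.org/repo.git;refname=kernel": {"refname": "kernel",
                                                  "revision": "abc123"},
    "file://local.patch": {"revision": None},  # no refname: skipped
}
export_srcrevs(urldata, d)
print(d.getVar("SRCREV_kernel"))  # -> abc123
```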
> I have two concepts: revision_counter and
> revision_identifier. The latter is opaque and need not be comparable
> (e.g. git object hashes), while the former must be monotonically
> increasing.
I like the idea of having both the revision identifier and the counter
exported from each fetcher; it makes a lot of sense.
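To check my understanding of the pair, a rough model (the class names
and the locally assigned git counter are mine, not your patch's):

```python
# Sketch: each fetcher yields an opaque revision_identifier plus a
# monotonic revision_counter. For svn both are 'Last Changed Rev'; for
# git the counter is assigned locally (in the real patch this mapping
# would live in the pickled fetcher_state, not in memory).

class SvnRevision:
    def __init__(self, last_changed_rev):
        self.identifier = str(last_changed_rev)  # already comparable
        self.counter = last_changed_rev

class GitRevision:
    _seen = {}  # hash -> locally assigned counter

    def __init__(self, commit_hash):
        self.identifier = commit_hash  # opaque, not orderable
        if commit_hash not in self._seen:
            self._seen[commit_hash] = len(self._seen) + 1
        self.counter = self._seen[commit_hash]

a = GitRevision("deadbeef")
b = GitRevision("cafebabe")
print(a.counter, b.counter)             # -> 1 2
print(GitRevision("deadbeef").counter)  # -> 1, stable for a known hash
```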
> For svn both are the same ('Last changed rev' from svn info), for git
> the revision_identifier is the hash and the revision_counter is created
> locally. To make this possible I introduced something called
> fetcher_state (i.e. a pickle'd file in FETCHERSTATEDIR, which I also
> added). (Note that I should maybe replace the fetcher_state code with
> the runcache code described below, which handles locking better.)
>
> In order to not have to go to the network for each revision_identifier
> lookup I initially just cached it in the bb.data object that is always
> passed around. Didn't work. So I then tried to cache it in the bb.fetch
> module (e.g. like the already existing urldata). That did work,
> somewhat, but left me wondering for hours why it sometimes just did
> not work, with no indication of the problem whatsoever. I then found
> out that you sneaked an os.fork() into bb.runqueue, which of course
> makes all in-memory caching impossible.
"Sneaked" isn't quite the right word; bitbake 1.8 was widely advertised
as multithreaded, so it had to fork somewhere! It forks even in the
single-threaded case so people can't make bad assumptions ;-).
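The effect on in-memory state is easy to reproduce, for anyone
following along: the child's writes never reach the parent.

```python
import os

# A fork()ed child gets a copy-on-write snapshot of the process: anything
# it caches in memory is invisible to the parent once it exits.
cache = {}
pid = os.fork()
if pid == 0:
    cache["rev"] = "abc123"  # populated only in the child's copy
    os._exit(0)
os.waitpid(pid, 0)
print(cache.get("rev"))  # -> None: the parent never sees the write
```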
I'd be interested to know which cases this didn't work in. At parse
time bitbake is single threaded (and effectively always will be), so if
you calculate SRCREV at parse time, you should be able to save it. It
does throw the data away after parsing, so we'd perhaps need some kind
of shared memory store to put it into, but that should be entirely
possible.
> So I need to do my revision caching on disk, even though I actually
> only want to cache for the duration of a bitbake run. I therefore
> instituted something I call "runcache" (actually "per-run cache", but
> that's just too long) that is just a pickle'd cache file, which is
> supposed to be cleared at the beginning of each run. With proper
> locking of course.
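Just so we're talking about the same thing, a minimal shape for such a
per-run cache might be the following. The RunCache name, the path
handling and the fcntl-based locking are my sketch, not your code:

```python
import fcntl
import os
import pickle
import tempfile

# Sketch of a "runcache": a pickled file cleared at the start of each
# run, with every access guarded by an exclusive flock. All names here
# are hypothetical; the real patch may well differ.
class RunCache:
    def __init__(self, path):
        self.path = path

    def _locked(self, mode):
        f = open(self.path, mode)
        fcntl.flock(f, fcntl.LOCK_EX)  # lock released when f is closed
        return f

    def clear(self):  # called at the beginning of a run
        with self._locked("wb") as f:
            pickle.dump({}, f)

    def get(self, key):
        with self._locked("rb") as f:
            return pickle.load(f).get(key)

    def set(self, key, value):
        with self._locked("rb+") as f:
            data = pickle.load(f)
            data[key] = value
            f.seek(0)
            f.truncate()
            pickle.dump(data, f)

cache = RunCache(os.path.join(tempfile.mkdtemp(), "runcache"))
cache.clear()
cache.set("SRCREV_kernel", "abc123")
print(cache.get("SRCREV_kernel"))  # -> abc123
```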
Ok, I think some kind of on disk caching will be needed since different
users are going to have different needs and hence different "cache
policies". Personally, I'd want to be able to manually reset the cache
when I wanted updates rather than having the system always go for them.
I think it would be better to have things in memory once bitbake is
running in a given session though, so perhaps we can create some kind
of hybrid here, with the policy controlled by a variable?
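Concretely, the hybrid I have in mind would look something like this.
The policy values and the whole class are assumptions for discussion,
not existing bitbake code:

```python
import os
import pickle
import tempfile

# Sketch of the hybrid: an in-memory dict consulted first, backed by an
# on-disk pickle, with a policy deciding whether a previous session's
# entries may be reused ("cache") or must be refetched ("clear").
class HybridRevCache:
    def __init__(self, path, policy):
        self.path = path
        self.mem = {}
        if policy == "cache" and os.path.exists(path):
            with open(path, "rb") as f:
                self.mem = pickle.load(f)  # reuse the previous session

    def lookup(self, key, fetch):
        if key not in self.mem:  # go to the network at most once per run
            self.mem[key] = fetch(key)
            with open(self.path, "wb") as f:
                pickle.dump(self.mem, f)  # persist for later sessions
        return self.mem[key]

path = os.path.join(tempfile.mkdtemp(), "srcrevs")
calls = []

def fake_fetch(key):
    calls.append(key)  # stands in for a remote "svn info" / git lookup
    return "rev-for-" + key

c = HybridRevCache(path, "cache")
c.lookup("repo", fake_fetch)
c.lookup("repo", fake_fetch)  # second call served from memory
print(len(calls))  # -> 1
```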
> In the openmoko patch you will see another necessary workaround when
> setting SRCREV_foo: in order to be able to call
> bb.fetch.export_srcrevs() one must call bb.fetch.init(), which calls
> initdata(), which might expand FILESPATH, which might contain PV,
> which might contain ${SRCREV_foo}.
That workaround is truly hideous ;-). Perhaps we should add an option
to bb.fetch.init() which allows it to only run against remote sources
(marking the fetchers as local/remote). It might be worth just adding a
special version of the function...
I'll continue to look at this tomorrow if I get a few moments...
Cheers,
Richard
_______________________________________________
Bitbake-dev mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/bitbake-dev