Re: [gentoo-portage-dev] cache subsystem replacement

Brian Harring Mon, 14 Nov 2005 08:31:16 -0800

On Tue, Nov 15, 2005 at 01:13:58AM +0900, Jason Stubbs wrote:
> Was talking with a guy yesterday who mentioned he had a 10-line patch that sped 
> up current portage a lot with regard to updating metadata. I asked him to 
> send it to me and here it is:
> 
> --- -   2005-10-29 18:49:15.156173000 +0900
> +++ /usr/lib/portage/pym/portage_db_cpickle.py  2005-10-08
> 11:13:37.000000000
> +0900
> @@ -61,6 +61,9 @@
>                 return False
>
>         def sync(self):
> +               return
> +
> +       def realsync(self):
>                 if self.modified:
>                         try:
>                                 if
> os.path.exists(self.filename):
> @@ -74,6 +77,6 @@
>                                 pass
>
>         def close(self):
> -               self.sync()
> +               self.realsync()
>                 self.db = None;


Ok, your mail client is screwing stuff up here ;)

The problem with the trick above is that, yeah, it delays syncs, but 
it also means if portage shuts down uncleanly _ever_, the entire 
eclass db of the old cache format is invalidated.  
All of it.  
Back to square one.
Massively bad thing, obviously.

This is also why the default sync rate of cache classes in the rewrite 
is 1: the cache updates every time a change is pushed to it.
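
To make the trade-off concrete, here is a minimal sketch in present-day 
Python 3. PickleCache, defer_sync and simulate_unclean_shutdown are 
made-up names for illustration, not portage's actual classes. It only 
models pending writes never reaching disk; it does not model the eclass 
db invalidation described above.

```python
import os
import pickle
import tempfile


class PickleCache:
    """Toy stand-in for a pickle-backed cache (hypothetical, not portage code)."""

    def __init__(self, filename, defer_sync=False):
        self.filename = filename
        self.defer_sync = defer_sync  # True mimics the quoted patch
        self.modified = False
        self.db = {}
        if os.path.exists(filename):
            with open(filename, "rb") as f:
                self.db = pickle.load(f)

    def __setitem__(self, key, value):
        self.db[key] = value
        self.modified = True
        if not self.defer_sync:
            # Equivalent of a sync rate of 1: write out on every change.
            self.sync()

    def sync(self):
        if not self.modified:
            return
        # Write to a temp file and rename, so a reader never sees a torn file.
        tmp = self.filename + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump(self.db, f)
        os.replace(tmp, self.filename)
        self.modified = False

    def close(self):
        self.sync()
        self.db = None


def entries_on_disk(filename):
    """Return how many entries actually made it to the cache file."""
    if not os.path.exists(filename):
        return 0
    with open(filename, "rb") as f:
        return len(pickle.load(f))


def simulate_unclean_shutdown(defer_sync, updates=5):
    """Push some updates, then 'die' without ever calling close()."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "cache.pickle")
        cache = PickleCache(path, defer_sync=defer_sync)
        for i in range(updates):
            cache["pkg-%d" % i] = {"DEPEND": ""}
        # No close() here: this is the unclean shutdown.
        return entries_on_disk(path)
```

With defer_sync=False every update is on disk when the process dies; 
with defer_sync=True nothing is, because the only real write happens in 
close().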

> I remembered seeing sync_rate when glancing through the new cache stuff and 
> then had a look into mirror_cache(). Playing with trg_cache.sync(x), I got 
> the following numbers.
> 
> x        total #1  total #2  total #3  median sys
> 1          13.651    13.451    13.727       2.712
> 10         13.413    13.412    13.645       2.538
> 100        13.605    13.498    13.405       2.700
> 1000       13.673    13.726    13.748       2.839
> 10000      14.541    14.054    13.447       2.743
> 100000     13.973    13.951    14.512       2.881
> 1000000    13.583    13.622    13.935       2.669
> 
> Command run was:
> 
> rm -rf /var/cache/edb/dep/*; time emerge -q metadata
> 
> So what does changing the sync_rate actually do? Ease seeks? Should I re-run 
> these tests with a reboot in between? (And what happened to the 4 seconds I 
> was getting with earlier patches? Bug fixes turn quantity into quality? :)

Umm... 4 seconds?  Eh?

Regarding what the sync_rate does: if the target cache supports 
batched updates (think rdbms), it can delay up to N modifications 
before pushing the changes out.

A cdb/cpickle cache backend would want to use this, for example.

As for why you're not seeing any variation: I'm pretty much positive 
you're using a cache that autocommits, meaning delayed sync'ing isn't 
possible.  Autocommit == can't batch, so the sync rate isn't 
used/valid.
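
A rough sketch of that behaviour, counting how many commits reach the 
backend. BatchingCache, autocommits and count_commits are illustrative 
names only and are not the rewrite's API; the in-memory dict stands in 
for whatever the real backend writes to.

```python
class BatchingCache:
    """Hypothetical cache that honours sync_rate only when it can batch."""

    def __init__(self, autocommits, sync_rate=1):
        self.autocommits = autocommits
        self.sync_rate = sync_rate
        self.pending = {}   # modifications not yet pushed out
        self.store = {}     # stand-in for the real backend
        self.commits = 0    # number of pushes to the backend

    def __setitem__(self, key, value):
        if self.autocommits:
            # Autocommit backend: every change goes straight out,
            # so sync_rate is never consulted.
            self.store[key] = value
            self.commits += 1
            return
        self.pending[key] = value
        if len(self.pending) >= self.sync_rate:
            self.commit()

    def commit(self):
        """Push all pending modifications out as a single batch."""
        if self.pending:
            self.store.update(self.pending)
            self.pending.clear()
            self.commits += 1


def count_commits(autocommits, sync_rate, updates=1000):
    """Push `updates` changes and report how many commits that cost."""
    cache = BatchingCache(autocommits, sync_rate)
    for i in range(updates):
        cache["pkg-%d" % i] = i
    cache.commit()  # flush whatever is left at the end
    return cache.commits
```

Under these assumptions an autocommitting cache costs 1000 commits for 
1000 updates whatever sync_rate is, while a batching cache with a 
sync_rate of 100 costs 10. That matches timings that stay flat across 
every value of x.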

The only cache in the rewrite that doesn't autocommit is the sqlite 
implementation (which coincidentally is why sync rate exists; inserts 
into sqlite are [EMAIL PROTECTED]@#*ing slow).
~harring
