IIRC, it's not really 2x nodes -- it's one extra row for each string key. So if a directory listing would normally consume a single string row, yes, there's 1 + 1 = 2 (or, 2x) rows used. But if a file's contents would consume 10 string rows, then it's still just the 1 additional empty row. THAT SAID, it does certainly seem inefficient.
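To put numbers on that, here is a rough sketch in plain Ruby (no bdb binding needed). The rows are copied from the "strings" dump in Vadim's message; the helper names `row_overhead` and `overhead_ratio` are made up for illustration and are not part of Subversion or the bdb gem. The 10-chunk key '2' is a hypothetical example, not real repository data.

```ruby
# Rows as [key, value] pairs, copied from the quoted "strings" dump
# (the 'next-key' bookkeeping row is left out).
STRINGS_ROWS = [
  ['0', ''],
  ['0', '((test.txt 5 1.0.1))'],
  ['1', ''],
  ['1', 'aaa']
]

# Hypothetical helper: per key, count all rows and the empty-valued rows.
def row_overhead(rows)
  rows.group_by { |key, _value| key }.transform_values do |group|
    empty = group.count { |_key, value| value.empty? }
    { total: group.size, empty: empty }
  end
end

# Hypothetical helper: rows stored divided by rows that carry data.
def overhead_ratio(rows)
  payload = rows.count { |_key, value| !value.empty? }
  rows.size.to_f / payload
end

# Assumed example: one key whose content spans 10 chunks plus the empty row.
BIG_ROWS = [['2', '']] + Array.new(10) { |i| ['2', "chunk#{i}"] }

puts row_overhead(STRINGS_ROWS).inspect
puts overhead_ratio(STRINGS_ROWS)  # 4 rows / 2 payload rows
puts overhead_ratio(BIG_ROWS)      # 11 rows / 10 payload rows
```

For the small strings in the dump the ratio is 2.0, which matches the 2x observation; for the assumed 10-chunk string it drops to 1.1, since the empty row is paid once per key rather than once per chunk.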
Wanna dive into the code and work up a patch?

Vadim Chekan wrote:
> Hi all,
>
> Out of curiosity I wrote a script which dumps subversion bdb tables
> and found interesting anomaly in "strings" table.
> Every string there has a duplicate with empty value.
> It is my understanding that "strings" allows duplicates to store very
> large content in chunks under the same key. That's fine. But why every
> small string (like file name) has a key duplicate? Looks like a bug to
> me.
> This bug does not prevent normal functioning because strings are
> concatenated when read and empty value does not harm, but from
> performance point of view, having 2x nodes in btree is not good.
>
> Here is what I'm talking about:
> =========== nodes ================
> k:'0.0.0' v:'((dir 1 / 0 1 0) 0 0 )'
> k:'0.0.1' v:'((dir 1 / 5 0.0.0 1 1 1 0 1 0) 0 1 0)'
> k:'1.0.1' v:'((file 9 /test.txt 0 1 0 1 0 1 0) 0 1 1)'
> k:'next-key' v:'2'
> =========== strings ================
> k:'0' v:''
> k:'0' v:'((test.txt 5 1.0.1))'
> k:'1' v:''
> k:'1' v:'aaa'
> k:'next-key' v:'2'
> =========== revisions ================
> k:'1' v:'(revision 1 0)'
> k:'2' v:'(revision 1 1)'
>
> Pay attention to "strings" key. Empty value is repeated for every string.
>
> My environment:
> svn, version 1.6.5 (r38866)
> Linux ubuntu 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 16:20:31 UTC
> 2009 i686 GNU/Linux
>
> Here is the script:
> ===========================================================
> #!/usr/bin/ruby
> require 'bdb'
>
> $env = BDB::Env.open('repo3/db', flags=BDB::INIT_MPOOL, mode=0)
>
> def list_content(file, db_type)
>   puts "=========== #{file} ================"
>   db = $env.open_db(db_type, name=file)
>   db.each do |k,v|
>     puts "k:'#{k}' v:'#{v}'"
>   end
>
>   db.close
> end
>
> # checksum-reps
> %w(changes copies nodes node-origins miscellaneous representations
> strings transactions).
>   each{|f| list_content(f, BDB::BTREE) }
>
> %w(revisions uuids).
>   each{|f| list_content(f, BDB::RECNO) }
> ===========================================================
>
>

-- 
C. Michael Pilato <cmpil...@collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand