IIRC, it's not really 2x nodes -- it's one extra row for each string key.
So if a directory listing would normally consume a single string row, yes,
there's 1 + 1 = 2 (or 2x) rows used.  But if a file's contents would consume
10 string rows, then it's still just the 1 additional empty row.  THAT
SAID, it does certainly seem inefficient.
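
To put numbers on that, here is a toy Ruby sketch.  It is not Subversion
code and does not touch BDB; it only models the behavior described in this
thread, assuming (per my recollection above) one empty row per key and
concatenation of duplicate rows on read.  ROWS, read_string and rows_used
are names made up for illustration.

```ruby
# Toy model of the "strings" table: an ordered list of [key, value] rows,
# where one key may appear on several rows (duplicates).
ROWS = [
  ['0', ''], ['0', '((test.txt 5 1.0.1))'],
  ['1', ''], ['1', 'aaa'],
].freeze

# Reading a string concatenates every row stored under its key,
# so the empty row contributes nothing to the result.
def read_string(rows, key)
  rows.select { |k, _| k == key }.map { |_, v| v }.join
end

# Rows used by one string: its data chunks plus the one empty row
# (assumption: exactly one extra row per key).
def rows_used(chunks)
  chunks + 1
end

puts read_string(ROWS, '1')   # aaa
puts rows_used(1)             # 2
puts rows_used(10)            # 11
```

So the overhead is a fixed one row per key: 2x for a one-chunk string, but
only 11 rows instead of 10 for a ten-chunk one.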

Wanna dive into the code and work up a patch?


Vadim Chekan wrote:
> Hi all,
> 
> Out of curiosity I wrote a script that dumps the Subversion BDB tables
> and found an interesting anomaly in the "strings" table.
> Every string there has a duplicate with an empty value.
> It is my understanding that "strings" allows duplicates in order to store
> very large content in chunks under the same key. That's fine. But why does
> every small string (like a file name) have a duplicate key? Looks like a
> bug to me.
> This bug does not prevent normal functioning, because strings are
> concatenated when read and an empty value does no harm, but from a
> performance point of view, having 2x nodes in the btree is not good.
> 
> Here is what I'm talking about:
> =========== nodes  ================
> k:'0.0.0' v:'((dir 1 / 0  1 0) 0  0 )'
> k:'0.0.1' v:'((dir 1 / 5 0.0.0 1 1 1 0 1 0) 0  1 0)'
> k:'1.0.1' v:'((file 9 /test.txt 0  1 0 1 0 1 0) 0  1 1)'
> k:'next-key' v:'2'
> =========== strings  ================
> k:'0' v:''
> k:'0' v:'((test.txt 5 1.0.1))'
> k:'1' v:''
> k:'1' v:'aaa'
> k:'next-key' v:'2'
> =========== revisions  ================
> k:'1' v:'(revision 1 0)'
> k:'2' v:'(revision 1 1)'
> 
> Look at the keys in "strings": an empty value is repeated for every string.
> 
> My environment:
> svn, version 1.6.5 (r38866)
> Linux ubuntu 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 16:20:31 UTC
> 2009 i686 GNU/Linux
> 
> Here is the script:
> ===========================================================
> #!/usr/bin/ruby
> require 'bdb'
> 
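> # Open the repository's BDB environment with only the memory pool
> # subsystem initialized.
> $env = BDB::Env.open('repo3/db', BDB::INIT_MPOOL, 0)
> 
> # Print every key/value row of one table, duplicates included.
> def list_content(file, db_type)
>     puts "=========== #{file}  ================"
>     db = $env.open_db(db_type, file)
>     db.each do |k,v|
>         puts "k:'#{k}' v:'#{v}'"
>     end
> 
>     db.close
> end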
> 
> # checksum-reps
> %w(changes copies nodes node-origins miscellaneous representations
> strings transactions).
>     each{|f| list_content(f, BDB::BTREE) }
> 
> %w(revisions uuids).
>    each{|f| list_content(f, BDB::RECNO) }
> ===========================================================
> 
> 


-- 
C. Michael Pilato <cmpil...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

