Hi

Benjamin Scott wrote:
> Hello again,
>
> As I mentioned in my message entitled "New to ReiserFS...", I have some
> questions regarding the mkreiserfs utility and the choice of hash function.
> I have Linux kernel 2.2.19, reiserfs 3.5.34, and reiserfsprogs 3.x.0j. I am
> looking at the manual page for mkreiserfs, specifically, the "-h" option,
> and wondering which hash function to choose.
>
> From the mailing list archives, and some web searches, I gather that
> "rupasov" is depreciated, "r5" is a good general choice, and "tea" is the
> most robust, but carries a performance penalty.
>
> Now, better performance is always welcome, but data integrity and
> availability is of paramount importance. Does anyone have any opinions on
> how likely, really, a hash collision is with r5? If you need figures, we
> are talking between 500 and 600 GB of data, with maybe ten thousand files in
> a directory, max, and file sizes ranging from 20 KB to 50 MB.
>

It is probably possible to break 'r5' and generate names that have the same
r5 value. But I do not think it will ever fail unless you do that
intentionally.

>
> Also: In my web searches, I found this webpage
>
> http://hints.linuxfromscratch.org/hints/reiserfs.txt
>
> and it made me nervous. The phrase "lost data" will do that to me. :) Of
> course, that page is not a definitive source, but I am having trouble
> finding a source I could call definitive.
>
> Basically, what I am wondering is: If the worst-case does occur, will
> ReiserFS fail the operation gracefully, or does it do something nasty, like
> eat the data or scramble the filesystem? An "Out of space" or similar error
> message is acceptable for pathological cases, but silently corrupting data
> is a thought that gives me nightmares. Is that webpage bogus, or is this a
> real threat?

reiserfs should not scramble the filesystem when the number of hash
collisions reaches its limit. It should return -125 in 2.2 and -EBUSY in 2.4.

Thanks,
vs
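
For anyone who wants to get a feel for how crowded a single hash value becomes with about ten thousand names in one directory, here is a rough Python sketch. It is an illustration only, not the kernel code: r5_like() is my own transcription of what I understand the r5 function to do, the 0x7FFFFF80 mask is an assumption about which bits of the hash are kept, and the file%05d.dat names are made up. Substitute your real file names to get a meaningful answer.

```python
from collections import Counter


def r5_like(name: bytes) -> int:
    """Rolling multiplicative hash modelled on r5 (illustrative, 32-bit wraparound)."""
    a = 0
    for c in name:
        # Mix in the byte shifted both ways, then multiply by 11.
        a = (a + (c << 4) + (c >> 4)) & 0xFFFFFFFF
        a = (a * 11) & 0xFFFFFFFF
    return a


def worst_bucket(names, hash_fn=r5_like, mask=0x7FFFFF80):
    """Return how many names share the most crowded (masked) hash value."""
    # The mask is an assumption about which hash bits the filesystem keeps.
    counts = Counter(hash_fn(n) & mask for n in names)
    return max(counts.values())


if __name__ == "__main__":
    # Hypothetical directory of ten thousand files.
    names = [b"file%05d.dat" % i for i in range(10000)]
    print("names sharing the most crowded hash value:", worst_bucket(names))
```

If the number printed stays well below the collision limit mentioned above, ordinary names are unlikely to trigger the error return. Names built deliberately to collide are a different matter.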
