We have been working on the assumption that Windows preallocates space and 
then lazily zero-fills it on first read. Our tactic has therefore been to 
set the file length, which shouldn't take long, so that we know the space is 
available.

Unfortunately, this is not the case. Unless we call SetFileValidData(), which 
is not readily available through Java, Windows will write all the zeros we 
had hoped to skip. Recently a user complained that on Vista with an encrypted 
(TrueCrypt) drive and a 200GB store, this causes a single syscall to hang the 
process with 100% kernel CPU, making it effectively unkillable, for a 
considerable time (at least 20 minutes, probably more).

Thus the obvious solution is to *not* set the file length first, and instead 
write pseudorandom data to the file until we reach the desired length. This 
would happen during startup, after creating a new salted hash datastore. It 
should take almost exactly the same time as just setting the file length, it 
has some slight security benefits, and because it is many small writes rather 
than one huge syscall, it won't make the process unkillable.

On *nix, we can rely on holes (sparse files), so preallocation can be 
optional, and can happen in the background after startup, assuming that 
fragmentation isn't a problem.

Thoughts? What code changes are needed? We already have support for 
preallocation, but we do it after setting the file length...

Sources:
http://msdn.microsoft.com/en-us/library/aa365544(VS.85).aspx

"The SetFileValidData function allows you to avoid filling data with zeros 
when writing nonsequentially to a file. The function makes the data in the 
file valid without writing to the file. As a result, although some 
performance gain may be realized, existing data on disk from previously 
existing files can inadvertently become available to unintended readers. The 
following paragraphs provide a more detailed description of this potential 
security and privacy issue."

[22:58] <phrosty> hmm salted hash store doesn't work too well does it
[23:04] <toad_> phrosty: what makes you say that?
[23:04] <toad_> phrosty: most reports have been very positive
[23:12] <phrosty> something freenet does while migrating (on fresh install 
even.. empty store) causes java to enter a syscall that makes the kernel use 
100% cpu
[23:17] <toad_> hmm
[23:17] * toad_ wonders if that's the padding...
[23:17] <toad_> what OS?
[23:17] <phrosty> vista
[23:17] <toad_> compressed disk?
[23:17] <phrosty> nope
[23:17] <toad_> hmmm
[23:17] <toad_> post wrapper.log?
[23:18] <phrosty> it is encrypted (truecrypt) but i've never seen that use 
much cpu... disks are too slow to get loaded from it.
[23:18] <toad_> it might be filling the store with random data, i'm not sure 
at what point exactly it does that...
[23:18] <toad_> we can turn that off if it's a problem, i had hoped that it 
would make space usage more predictable :|
[23:18] <toad_> as well as arguably providing a minor security improvement
[23:19] <toad_> please post your wrapper.log ... then ... hmmm ...
[23:19] <toad_> are there rapidly growing files in the datastore/ directory?
[23:21] <phrosty> it's over now (took ~2hr last night)
[23:21] <toad_> it's finished?
[23:21] <toad_> well what are you complaining about then? :)
[23:21] <phrosty> long syscalls are odd behavior
[23:22] <phrosty> causing kernel to do 100% cpu was weird so i tried to end 
the task, but syscall blocks that.
[23:22] <toad_> it sounds like it was filling the store with random data, and 
your encryption took 100% CPU
[23:22] <toad_> well no syscall should use 100% kernel cpu for a long period
[23:23] <phrosty> does freenet open the files with any special modes?  
unbuffered/etc
[23:23] <toad_> no
[23:23] <toad_> ah actually
[23:23] <toad_> windows preallocates space when we seek the end
[23:23] <toad_> that might be a single system call
[23:23] <phrosty> that'd be it
[23:23] <toad_> which could take some time for a LARGE datastore
[23:24] <phrosty> yea, i've got 200GB
[23:24] <toad_> so that's your answer
[23:24] <toad_> unavoidable i'm afraid
[23:25] <toad_> well we could stagger it somehow
[23:25] <toad_> i doubt it's a problem for unencrypted stores...
[23:25] <phrosty> wait - so you set the file size (which initializes with 0) 
then initialize with random data?  that would write everything twice
[23:26] <toad_> afaik it doesn't actually write the zeros
[23:26] <toad_> on windows
[23:27] <toad_> surely it just allocated the blocks in the allocation table 
etc?
[23:27] <toad_> that's what i heard anyway
[23:27] <toad_> filling it up with randomness on windows is purely a security 
measure; on unix it's to use the space
[23:27] <phrosty> depends on how you do it, but a long syscall would 
definitely point me in that direction since it's a blocking call
[23:27] <toad_> maybe we should turn it off by default
[23:27] <toad_> it seems likely yes
[23:28] <toad_> do we need to do it gradually? it'd be several syscalls 
instead of 1...
[23:29] <toad_> would that help in any meaningful way?
[23:29] <phrosty> that'd be good to allow ctrl+alt+del.  if random init can be 
turned off (and you don't mind stale data from old files for the content), 
you can use SetFileValidData() for a massive perf boost (instant, zero-frag).  
if you do mind stale data and want 0-init, sparse files could be used 
(instant, frag, slight overhead).
[23:30] <toad_> hmmm
[23:30] <toad_> so it actually writes the zeros?
[23:30] <toad_> by default?
[23:30] <phrosty> yea, windows always zero-inits
[23:30] <toad_> not lazily?
[23:32] <phrosty> indeed.  if you do want random init, why not just continue 
writing blocks until you hit the size?  instead of initializing the size 
beforehand
[23:32] <toad_> i heard it would be faster :|
[23:32] <toad_> it should give less fragmentation at least, surely?
[23:33] <phrosty> if other apps are writing to the same drive, probably.
[23:33] <toad_> so you think it would be no slower on windows to block startup 
writing random data?
[23:33] <toad_> no slower than seeking causing windows to write tons of zeros?
[23:33] <toad_> hmm
[23:33] <phrosty> yea, should not be slower.  i'd rather just use 
SetFileValidData() though, as it allocates all the blocks on disk without any 
init.
[23:33] <toad_> do you have a reference for this?
[23:34] <toad_> afaik SetFileValidData isn't available from java
[23:34] <toad_> we'd have to use JNI, that would be a pita
[23:34] <phrosty> probably :(
[23:34] <toad_> do you have a reference url?
[23:34] <phrosty> ms-help://MS.MSDNQTR.v90.en/fileio/fs/setfilevaliddata.htm
[23:34] <phrosty> oh, duh
[23:34] <phrosty> lol
[23:35] <toad_> hah
[23:35] <phrosty> http://msdn.microsoft.com/en-us/library/aa365544(VS.85).aspx
[23:35] <toad_> on the web?
[23:37] <toad_> ok, i'll deal with this, thanks...
[23:41] <toad_> phrosty: how long did it take?
[23:43] <phrosty> it took about 20min for the syscall to end, but it had been 
running for a while before i noticed.
[23:44] <phrosty> so i don't know