We have been working on the assumption that Windows preallocates space when a file is extended and then lazily zero-fills it when it is first read. The tactic has therefore been to set the file length, which shouldn't take long, after which we know that the space is available.
Unfortunately, this is not the case. Unless we call SetFileValidData(), which is not readily available through Java, Windows will write all the zeros we hoped we had skipped. Recently a user complained that on Vista with an encrypted drive, this causes a single syscall to hang the process at 100% kernel CPU, making it effectively unkillable, for a considerable time (at least 20 minutes, probably more).

Thus the obvious solution is to write pseudorandom data to the file until we reach the desired length, during startup after creating a new salted hash datastore, and *not* set the file length first. This should take almost exactly the same time as just setting the file length, it has some slight security benefits, and it won't cause the process to be unkillable.

On *nix, we can rely on holes, so preallocation can be optional and can happen in the background after startup, assuming that fragmentation isn't a problem.

Thoughts? What code changes are needed? We already have support for preallocation, but we do it after setting the file length...

Sources:

http://msdn.microsoft.com/en-us/library/aa365544(VS.85).aspx
"The SetFileValidData function allows you to avoid filling data with zeros when writing nonsequentially to a file. The function makes the data in the file valid without writing to the file. As a result, although some performance gain may be realized, existing data on disk from previously existing files can inadvertently become available to unintended readers. The following paragraphs provide a more detailed description of this potential security and privacy issue."

[22:58] <phrosty> hmm salted hash store doesn't work too well does it
[23:04] <toad_> phrosty: what makes you say that?
[23:04] <toad_> phrosty: most reports have been very positive
[23:12] <phrosty> something freenet does while migrating (on fresh install even.. empty store) causes java to enter a syscall that makes the kernel use 100% cpu
[23:17] <toad_> hmm
[23:17] * toad_ wonders if that's the padding...
[23:17] <toad_> what OS?
[23:17] <phrosty> vista
[23:17] <toad_> compressed disk?
[23:17] <phrosty> nope
[23:17] <toad_> hmmm
[23:17] <toad_> post wrapper.log?
[23:18] <phrosty> it is encrypted (truecrypt) but i've never seen that use much cpu... disks are too slow to get loaded from it.
[23:18] <toad_> it might be filling the store with random data, i'm not sure at what point exactly it does that...
[23:18] <toad_> we can turn that off if it's a problem, i had hoped that it would make space usage more predictable :|
[23:18] <toad_> as well as arguably providing a minor security improvement
[23:19] <toad_> please post your wrapper.log ... then ... hmmm ...
[23:19] <toad_> are there rapidly growing files in the datastore/ directory?
[23:21] <phrosty> it's over now (took ~2hr last night)
[23:21] <toad_> it's finished?
[23:21] <toad_> well what are you complaining about then? :)
[23:21] <phrosty> long syscalls are odd behavior
[23:22] <phrosty> causing kernel to do 100% cpu was weird so i tried to end the task, but syscall blocks that.
[23:22] <toad_> it sounds like it was filling the store with random data, and your encryption took 100% CPU
[23:22] <toad_> well no syscall should use 100% kernel cpu for a long period
[23:23] <phrosty> does freenet open the files with any special modes? unbuffered/etc
[23:23] <toad_> no
[23:23] <toad_> ah actually
[23:23] <toad_> windows preallocates space when we seek the end
[23:23] <toad_> that might be a single system call
[23:23] <phrosty> that'd be it
[23:23] <toad_> which could take some time for a LARGE datastore
[23:24] <phrosty> yea, i've got 200GB
[23:24] <toad_> so that's your answer
[23:24] <toad_> unavoidable i'm afraid
[23:25] <toad_> well we could stagger it somehow
[23:25] <toad_> i doubt it's a problem for unencrypted stores...
[23:25] <phrosty> wait - so you set the file size (which initializes with 0) then initialize with random data? that would write everything twice
[23:26] <toad_> afaik it doesn't actually write the zeros
[23:26] <toad_> on windows
[23:27] <toad_> surely it just allocated the blocks in the allocation table etc?
[23:27] <-- Mathiasdm has left this server ("bye").
[23:27] <toad_> that's what i heard anyway
[23:27] <toad_> filling it up with randomness on windows is purely a security measure; on unix it's to use the space
[23:27] <phrosty> depends on how you do it, but a long syscall would definitely point me in that direction since it's a blocking call
[23:27] <toad_> maybe we should turn it off by default
[23:27] <toad_> it seems likely yes
[23:28] <toad_> do we need to do it gradually? it'd be several syscalls instead of 1...
[23:29] <toad_> would that help in any meaningful way?
[23:29] <phrosty> that'd be good to allow ctrl+alt+del. if random init can be turned off (and you don't mind stale data from old files for the content), you can use SetFileValidData() for a massive perf boost (instant, zero-frag). if you do mind stale data and want 0-init, sparse files could be used (instant, frag, slight overhead).
[23:30] <toad_> hmmm
[23:30] <toad_> so it actually writes the zeros?
[23:30] <toad_> by default?
[23:30] <phrosty> yea, windows always zero-inits
[23:30] <toad_> not lazily?
[23:32] <phrosty> indeed. if you do want random init, why not just continue writing blocks until you hit the size? instead of initializing the size beforehand
[23:32] <toad_> i heard it would be faster :|
[23:32] <toad_> it should give less fragmentation at least, surely?
[23:33] <phrosty> if other apps are writing to the same drive, probably.
[23:33] <toad_> so you think it would be no slower on windows to block startup writing random data?
[23:33] <toad_> no slower than seeking causing windows to write tons of zeros?
[23:33] <toad_> hmm
[23:33] <phrosty> yea, should not be slower. i'd rather just use SetFileValidData() though, as it allocates all the blocks on disk without any init.
[23:33] <toad_> do you have a reference for this?
[23:34] <toad_> afaik SetFileValidData isn't available from java
[23:34] <toad_> we'd have to use JNI, that would be a pita
[23:34] <phrosty> probably :(
[23:34] <toad_> do you have a reference url?
[23:34] <phrosty> ms-help://MS.MSDNQTR.v90.en/fileio/fs/setfilevaliddata.htm
[23:34] <phrosty> oh, duh
[23:34] <phrosty> lol
[23:35] <toad_> hah
[23:35] <phrosty> http://msdn.microsoft.com/en-us/library/aa365544(VS.85).aspx
[23:35] <toad_> on the web?
[23:35] <-- sanity_ has left this server (Read error: 110 (Connection timed out)).
[23:37] <toad_> ok, i'll deal with this, thanks...
[23:41] <toad_> phrosty: how long did it take?
[23:43] <phrosty> it took about 20min for the syscall to end, but it had been running for a while before i noticed.
[23:44] <phrosty> so i don't know
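
For discussion, here is a rough sketch of the change proposed at the top of this mail: grow the store file by appending pseudorandom chunks until it reaches the target length, never calling setLength() or seeking past the end first. The class name RandomPreallocator, the method fillToLength and the 64 KiB CHUNK_SIZE are made up for illustration and are not the existing datastore code; java.util.Random stands in for whatever PRNG the node would really use.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Random;

/** Hypothetical helper, not the real salted hash store code. */
final class RandomPreallocator {

    /** Assumed chunk size; each write() is one short syscall. */
    static final int CHUNK_SIZE = 64 * 1024;

    /**
     * Appends pseudorandom bytes until the file is at least targetLength
     * bytes long. The length is never set up front, so the OS is never
     * asked to zero-fill a huge range in a single call. A file that is
     * already long enough is left untouched.
     */
    static void fillToLength(File file, long targetLength, Random random)
            throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            long pos = raf.length();   // resume where a previous run stopped
            raf.seek(pos);             // seek to the current end, not beyond it
            byte[] buf = new byte[CHUNK_SIZE];
            while (pos < targetLength) {
                random.nextBytes(buf);
                int n = (int) Math.min(buf.length, targetLength - pos);
                raf.write(buf, 0, n);
                pos += n;
            }
        }
    }
}
```

Something like RandomPreallocator.fillToLength(storeFile, storeSize, random) would then run at startup right after creating a new store. Because the work is split into many small writes, the process should stay killable between them, which is the point raised in the log above; whether this is really no slower than letting Windows write the zeros would still need measuring.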
