Response to a long thread on FMS about how to reduce Freenet's disk I/O, what realistic system requirements are, when we can expect SSDs to take over, and whether Freenet will kill commodity disks as a matter of routine.
================================================================================

I'm just going to reply to everyone here.

First, hard disk writes are not the only limited resource: as has been pointed out, RAM is limited too. So we can't assume that node.db4o will fit in RAM (much less persistent-blob.tmp, which is intimately related to node.db4o). But we should make good use of this happy situation when it occurs.

Second, if there is plenty of RAM, the OS will cache the file. So reads aren't a problem - they just create more CPU usage. Writes are the big problem: how do we reduce the number of database writes? As far as I know, if the node is "idle", in the sense that the requests are failing, we do no database writes at all. However, there may be some maintenance.

One big question is: is a short burst of writes every so often preferable to writes every second?

Possible benefits:
- It's closer to what the hard disks expect, so hopefully it will have less impact on hard disk lifespans.
- The seeks can be quite small, so it should be fast-ish.

The possible drawback is that, since it is more intense, it might have a bigger negative impact on the rest of the system (for a short time). That might be bad for e.g. online gaming, although we will want a gamer mode or something eventually (ideally with platform-specific autodetection helpers).

Considering the datastore alone, it is perfectly feasible, and safe, to aggregate writes in memory, provided there is sufficient memory (based on the overall heap limit, which in turn is based on the detected amount of memory). One complication is that if the data doesn't hit the main datastore it should still be in the ULPR/slashdot cache, so we'd need to allow that to access the in-memory blocks where appropriate.

Now, regarding the database (node.db4o):
- It is hard to make uploads not use lots of database queries without substantial changes. I may look into it but expect it to be difficult.
- Accepting limited potential data loss is not, at present, an option. The database is more likely to completely die than just lose some changes. This is why we fsync on commit, and commit frequently. Since we abuse the nominally ACID nature of the database (we never roll back), we can (and do) commit only when something important happens or periodically, but there is still a lot of traffic. Sadly Freetalk/WoT do use rollback so have to commit EVERY TIME.
- Periodic backups (synchronized with the persistent-blob file) could avoid the need for fsync. This would greatly reduce the actual disk writes by allowing the operating system to optimise them properly.
- In theory we could do more aggressive caching once we have this infrastructure, up to and including keeping the whole thing in RAM and writing it periodically. We would need to smoothly handle it growing so it doesn't fit.
- The actual blocks are just big linear writes, so it's much more efficient to buffer database writes than to buffer unwritten blocks. If we have a lot of RAM it may make sense to do both, which would further complicate the above.
- Database jobs can be very slow, especially if RAM is limited (meaning we have to do lots of reads because the OS isn't caching the whole file). Things like unpacking the next layer of a splitfile can be hideously slow. We can't necessarily aggregate commits, at least not at the job level. On the other hand, we DO aggregate commits at the job level to some degree, in the sense that while a big job such as the above is running, the new blocks coming in are queued; eventually we stop fetching new blocks. IIRC mostly they are written to disk to save memory. :|

A lot of the above depends on an awful lot of RAM being available. Possibly we should tweak the autodetection. Certainly we will affect system performance by using too much RAM, just as we do with too many disk writes.

Unfortunately there are other places we write frequently, such as the peers files, too.
These need debugging.

So, what of the above is not already on the bug tracker?
1. Do we want to aggregate writes to the datastore and write them periodically? (Implementation issues mentioned above)
2. Caching of blocks for persistent-blob.tmp, as well as of the database itself, if we have lots of RAM, after implementing auto-backups.
3. Can we give Freenet any more RAM? The current allocation (the wrapper memory limit, which does not include things like thread stacks) is:
   <512MB -> 128MB
   <1GB   -> 192MB
   <2GB   -> 256MB
   Else      512MB
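
As a rough illustration of question 1 and of the memory tiers in question 3, here is a minimal Java sketch of aggregating datastore writes in memory and flushing them in bursts. Everything named here (AggregatingStoreSketch, DiskStore, wrapperLimitMB, the byte budget passed to the constructor) is hypothetical and not taken from the Freenet code; only the four tier values come from the list above. Reads check the pending blocks first, which is roughly the access the ULPR/slashdot cache would need.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; none of these names come from the Freenet source. */
final class AggregatingStoreSketch {

    /** Stand-in for the real on-disk datastore (hypothetical interface). */
    interface DiskStore {
        void writeAll(Map<String, byte[]> blocks); // one burst of writes
        byte[] read(String key);                   // null if the key is absent
    }

    /** Wrapper memory limit in MB for a detected RAM size in MB, per the tiers above. */
    static int wrapperLimitMB(long detectedRamMB) {
        if (detectedRamMB < 512) return 128;
        if (detectedRamMB < 1024) return 192;
        if (detectedRamMB < 2048) return 256;
        return 512;
    }

    private final Map<String, byte[]> pending = new LinkedHashMap<>();
    private final DiskStore disk;
    private final long maxPendingBytes; // assumed budget, chosen by the caller
    private long pendingBytes = 0;

    AggregatingStoreSketch(DiskStore disk, long maxPendingBytes) {
        this.disk = disk;
        this.maxPendingBytes = maxPendingBytes;
    }

    /** Queue a block in memory; flush early if the budget is exceeded. */
    synchronized void put(String key, byte[] block) {
        byte[] old = pending.put(key, block);
        pendingBytes += block.length - (old == null ? 0 : old.length);
        if (pendingBytes > maxPendingBytes) flush();
    }

    /** Check unwritten blocks first, then fall back to the disk store. */
    synchronized byte[] get(String key) {
        byte[] block = pending.get(key);
        return block != null ? block : disk.read(key);
    }

    /** Write everything queued in one burst; meant to be called periodically. */
    synchronized void flush() {
        if (pending.isEmpty()) return;
        disk.writeAll(new LinkedHashMap<>(pending));
        pending.clear();
        pendingBytes = 0;
    }

    /** Number of blocks currently held only in memory. */
    synchronized int pendingCount() {
        return pending.size();
    }
}
```

A caller might size the budget as some fraction of wrapperLimitMB(detected RAM); what fraction is safe is exactly the open question in point 3. A real version would also need to call flush() from a timer and on shutdown, and this sketch says nothing about the node.db4o side.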