> The application will have a large number of records, with the records
> consisting of a fixed part and a number (n) of periodic parts.
> * The fixed part is updated occasionally.
> * The periodic parts are never updated, but a new one is added every 5 to 10
> minutes. Only the last n periodic parts need to be kept, so that the oldest
> one can be deleted after adding a new part.
> * The records will always be read completely (meaning fixed part and all
> periodic parts). Reads are less frequent than writes.
> The application will be running continuously, at least for a few weeks, so
> there will be many, many stale periodic parts, so I'm a bit worried about
> data consumption and compactions.

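To put rough numbers on the workload quoted above, here is a back-of-the-envelope sketch in Python. It only uses the 5 to 10 minute interval from your description. The 3-week runtime, the value n = 100, and the function name are placeholders I made up for illustration, so substitute your real figures.

```python
def stale_parts_per_record(runtime_weeks: int, interval_minutes: int, n_kept: int) -> int:
    """Estimate how many periodic parts of one record have become stale.

    Assumes one new periodic part is written every `interval_minutes`
    and only the newest `n_kept` parts are still needed.
    """
    total_minutes = runtime_weeks * 7 * 24 * 60
    parts_written = total_minutes // interval_minutes
    # Everything beyond the newest n_kept parts has been deleted (is stale).
    return max(0, parts_written - n_kept)


# Hypothetical inputs: 3 weeks of runtime, n = 100 parts kept per record.
fastest = stale_parts_per_record(runtime_weeks=3, interval_minutes=5, n_kept=100)
slowest = stale_parts_per_record(runtime_weeks=3, interval_minutes=10, n_kept=100)
print(f"stale parts per record: between {slowest} and {fastest}")
```

Multiplying the result by the number of records and the size of one periodic part gives an upper bound on the data that deletes and compaction have to reclaim, which is the kind of figure that would help answer the questions below.
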
I was going to hit send on a partial recommendation, but realized I don't really have enough information, given that you seem to be making pretty specific optimizations.

You say writes are more frequent than reads. To what extent? Are reads *very* infrequent, to the point that read performance is almost completely irrelevant?

You seem worried about tombstones and data size. Is the issue that you're expecting huge amounts of data, so that disk space and compaction frequency become a problem? Are you expecting the write load to be high enough that the performance of writes (and compaction) is a concern, or is it mostly about slowly building up huge amounts of data that you want to keep compact on disk?

--
/ Peter Schuller
