Hi All,

I'm seriously considering adding an interface to the Cyrus DB abstraction. It's basically "fetch_next", which is like fetch, but returns the alphabetically NEXT record rather than the record for the given key. It already exists (pretty much) inside foreach, but I'd like to have it available outside foreach as well, so the caller controls the loop.
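
To make the intended semantics concrete, here is a rough sketch in Python rather than the real Cyrus DB C code. Everything in it is hypothetical: the SortedStore class, the next_chunk and walk_chunks helpers, and the (key, value) return shape are made up for illustration, and a real backend would compare keys bytewise and run under a transaction. It only shows "return the first record strictly after the given key" and how a caller could use that to resume a scan from the last key it saw.

```python
import bisect


class SortedStore:
    """Hypothetical in-memory stand-in for a sorted key/value backend."""

    def __init__(self, records):
        self._data = dict(records)
        self._keys = sorted(self._data)

    def fetch(self, key):
        """Return the value stored under exactly this key, or None."""
        return self._data.get(key)

    def fetch_next(self, key):
        """Return (next_key, value) for the first key strictly greater
        than `key`, or None if no such record exists. `key` itself does
        not need to be present in the store."""
        i = bisect.bisect_right(self._keys, key)
        if i == len(self._keys):
            return None
        k = self._keys[i]
        return k, self._data[k]

    def delete(self, key):
        """Remove a record if it exists."""
        if key in self._data:
            del self._data[key]
            del self._keys[bisect.bisect_left(self._keys, key)]


def next_chunk(store, after, limit):
    """Collect up to `limit` records that sort after `after`."""
    chunk = []
    while len(chunk) < limit:
        rec = store.fetch_next(after)
        if rec is None:
            break
        chunk.append(rec)
        after = rec[0]
    return chunk


def walk_chunks(store, limit):
    """Yield the whole store in batches, resuming from the last key seen.
    Assumes no record uses the empty string as its key."""
    after = ""
    while True:
        chunk = next_chunk(store, after, limit)
        if not chunk:
            return
        yield chunk
        after = chunk[-1][0]
```

Because each batch resumes by key rather than by position, the caller can delete the records it has just seen between batches without upsetting the scan.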

There are a couple of reasons, but the main one is to be able to chunk things, for example duplicate_prune. I just watched a duplicate_prune run for over half an hour doing stacks of fdatasync calls as it wrote every single delete separately. I'd like to do a few thousand records at a time. The ideal (fastest!) way would be to read a few thousand records under a single transaction, commit all the deletes at once, then start again from where it left off.

At the moment the only interfaces are:

a) one transaction per record; or
b) one transaction for the entire file

There's no nice way to break it up into bite-sized chunks!

What do you think? Any suggestions about what you'd want or what you might use? Does this interface seem sane?

Regards,

Bron.
