Indeed puzzling. If you delete the database (DELETE /dbname) and the request succeeds (2xx response), then all of the db's data is fully deleted. If you think you're seeing data persist after deletion, you have a problem: either the delete is failing, you're not really deleting the db, or something extremely strange is happening.
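As a quick sanity check, something like this shows whether the delete really
succeeds (reusing the db name from the thread; add -u admin:password if the
server requires authentication):

  curl -i -X DELETE localhost:5984/xxxxxxx_1590

A 200 with {"ok":true} means the data is gone; anything non-2xx (401, 404, and
so on) means the database, and its shard files, are still on disk.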
Another cause of invisible bloat would be failed writes (especially ones with
attachment data): we write the data as we go, but if the write then fails,
that leaves the partial write in the file with nothing pointing back at it.
Compaction will clean that up, of course (there's a sketch of kicking it off
manually after the quoted thread below).

Compaction is essential in practically all cases. You could maybe get away
with disabling it if you never create, update or delete a document, but even
then the files will grow on restart (and perhaps when the db is closed and
reopened?) as we'll append a new database footer.

--
Robert Samuel Newson
rnew...@apache.org

On Thu, 2 May 2019, at 18:02, Adam Kocoloski wrote:
> Hi Willem,
>
> Good question. CouchDB has a 100% copy-on-write storage engine,
> including for all updates to btree nodes, etc., so any updates to the
> database will necessarily increase the file size before compaction.
> Looking at your info I don't see a heavy source of updates, so it is a
> little puzzling.
>
> Adam
>
> > On May 2, 2019, at 12:53 PM, Willem Bison <wil...@nappkin.nl> wrote:
> >
> > Hi Adam,
> >
> > I ran "POST compact" on the DB mentioned in my post and 'disk_size' went
> > from 729884227 (yes, it had grown that much in 1 hour!?) to 1275480.
> >
> > Wow.
> >
> > I disabled compacting because I thought it was useless in our case since
> > the db's and the docs are so small. I do wonder how it is possible for a
> > db to grow so much when it's being deleted several times a week. What is
> > all the 'air'?
> >
> > On Thu, 2 May 2019 at 18:31, Adam Kocoloski <kocol...@apache.org> wrote:
> >
> >> Hi Willem,
> >>
> >> Compaction would certainly reduce your storage space. You have such a
> >> small number of documents in these databases that it would be a fast
> >> operation. Did you try it and run into issues?
> >>
> >> Changing cluster.q shouldn't affect the overall storage consumption.
> >>
> >> Adam
> >>
> >>> On May 2, 2019, at 12:15 PM, Willem Bison <wil...@nappkin.nl> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Our CouchDB 2.3.1 standalone server (AWS Ubuntu 18.04) is using a lot
> >>> of disk space, so much so that it regularly causes a full disk and a
> >>> crash.
> >>>
> >>> The server contains approximately 100 databases, each with a reported
> >>> (Fauxton) size of less than 2.5 MB and fewer than 250 docs. Yesterday
> >>> the 'shards' folders combined exceeded a total of 14 GB, causing the
> >>> server to crash.
> >>>
> >>> The server is configured with
> >>> cluster.n = 1 and
> >>> cluster.q = 8
> >>> because that was suggested during setup.
> >>>
> >>> As I write this, the 'shards' folders look like this:
> >>>
> >>> /var/lib/couchdb/shards# du -hs *
> >>> 869M 00000000-1fffffff
> >>> 1.4G 20000000-3fffffff
> >>> 207M 40000000-5fffffff
> >>> 620M 60000000-7fffffff
> >>> 446M 80000000-9fffffff
> >>> 458M a0000000-bfffffff
> >>> 400M c0000000-dfffffff
> >>> 549M e0000000-ffffffff
> >>>
> >>> One of the largest files is this:
> >>>
> >>> curl localhost:5984/xxxxxxx_1590
> >>> {
> >>>   "db_name": "xxxxxxx_1590",
> >>>   "purge_seq": "0-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXIwNDLXMwBCwxygFFNSApBMqv___39WIgMedXksQJKhAUgBlc4nRu0DiFoC5iYpgOy3J9L-BRAz9-NXm8iQJE_YYgeQxfFEWnwAYvF9oNosADncXo4",
> >>>   "update_seq": "3132-g1AAAAFWeJzLYWBg4MhgTmEQTM4vTc5ISXIwNDLXMwBCwxygFFMiQ5L8____sxI18ChKUgCSSfYgdUkMDNw1-JQ6gJTGg42UxacuAaSuHqxOAo-6PBYgydAApIBK52clchNUuwCidn9Wog5BtQcgau9nJQoTVPsAohboXsksAJuwX9Y",
> >>>   "sizes": {
> >>>     "file": 595928643,
> >>>     "external": 462778,
> >>>     "active": 1393380
> >>>   },
> >>>   "other": {
> >>>     "data_size": 462778
> >>>   },
> >>>   "doc_del_count": 0,
> >>>   "doc_count": 74,
> >>>   "disk_size": 595928643,
> >>>   "disk_format_version": 7,
> >>>   "data_size": 1393380,
> >>>   "compact_running": false,
> >>>   "cluster": {
> >>>     "q": 8,
> >>>     "n": 1,
> >>>     "w": 1,
> >>>     "r": 1
> >>>   },
> >>>   "instance_start_time": "0"
> >>> }
> >>>
> >>> curl localhost:5984/xxxxxxx_1590/_local_docs
> >>> {"total_rows":null,"offset":null,"rows":[
> >>> {"id":"_local/189d9109518d1a2167b06ca9639af5f2ba16f0a5","key":"_local/189d9109518d1a2167b06ca9639af5f2ba16f0a5","value":{"rev":"0-3022"}},
> >>> {"id":"_local/7b3e0d929201afcea44b237b5b3e86b35ff924c6","key":"_local/7b3e0d929201afcea44b237b5b3e86b35ff924c6","value":{"rev":"0-18"}},
> >>> {"id":"_local/7da4a2aaebc84d01ba0e2906ac0fcb82d96bfe05","key":"_local/7da4a2aaebc84d01ba0e2906ac0fcb82d96bfe05","value":{"rev":"0-3749"}},
> >>> {"id":"_local/9619b06f20d26b076e4060d050dc8e3bde878920","key":"_local/9619b06f20d26b076e4060d050dc8e3bde878920","value":{"rev":"0-172"}}
> >>> ]}
> >>>
> >>> Each database push/pull replicates with a small number of clients (< 10).
> >>> Most of the documents contain orders that are short-lived. We throw away
> >>> all db's 3 times a week as a brute-force purge.
> >>> Compacting has been disabled because it takes too much CPU and was
> >>> considered useless in our case (small db's, purging).
> >>>
> >>> I read this:
> >>> https://github.com/apache/couchdb/issues/1621
> >>> but I'm not sure how it helps me.
> >>>
> >>> These are my questions:
> >>> How is it possible that such a small db occupies so much space?
> >>> What can I do to reduce this?
> >>> Would changing 'cluster.q' have any effect, or would the same amount of
> >>> bytes be used in fewer folders? (Am I correct in assuming that
> >>> cluster.q > 1 is pointless in a standalone configuration?)
> >>>
> >>> Thanks!
> >>> Willem
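To make the compaction suggestion concrete, here's a minimal sketch of kicking
it off by hand, reusing the db name from Willem's mail (loop over your
databases, or cron it, if you keep automatic compaction disabled):

  curl -X POST -H "Content-Type: application/json" localhost:5984/xxxxxxx_1590/_compact

The call returns {"ok":true} straight away and compaction runs in the
background; "compact_running" in the db info goes back to false when it's
done, and "disk_size" should drop to roughly the active data size. The
automatic compaction daemon (the [compaction_daemon] and [compactions]
sections of the config) can do the same thing on a fragmentation threshold,
if the CPU cost turns out to be acceptable after all.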