Hello,
Yesterday we did a post-mortem examination of a server (in a 3-node cluster)
that crashed because its disk was full. It runs CouchDB 2.1.2 with little data
(1 GB), but `/var/lib/couchdb/` weighed 26 GB.
It looks like there is dark matter in the shard files.
For example, we have a small database `demo` (19 documents; it is never updated
but often replicated to create clone databases). One of its shard files is
surprisingly large:
```
1,4M    /var/lib/couchdb/shards/00000000-1fffffff/demo.1520354435.couch
```
As its metadata shows, its **uncompressed ("external") data size is 23 kB**, but its
**"active" size is 11,311 kB**!
```
$ curl localhost:5984/demo
{
  "db_name": "demo",
  "update_seq": "79-g1AAAAKjeJ...",
  "sizes": {
    "file": 11679510,
    "external": 23705,
    "active": 11311270
  },
  "other": {
    "data_size": 23705
  },
  "disk_size": 11679510,
  "data_size": 11311270,
  ...
}
```
([full-output.json](https://github.com/apache/couchdb/files/2410386/full-output.json.txt))
We've tried compacting it and also lowering `_revs_limit`, but neither had any effect.
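Concretely, the commands we ran were along these lines (admin credentials and exact values omitted):

```
# Trigger compaction of the database (CouchDB 2.x requires the JSON Content-Type header)
curl -X POST -H "Content-Type: application/json" http://localhost:5984/demo/_compact

# Lower the revision history limit (default is 1000)
curl -X PUT -d '100' http://localhost:5984/demo/_revs_limit
```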
Also, when replicating this database to a new one, the new size is normal:
```
8,2K    /var/lib/couchdb/shards/00000000-1fffffff/demo-copy.1537540293.couch
```
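For reference, the copy was created with a plain one-shot replication, roughly like this (admin credentials omitted):

```
# One-shot replication of demo into a fresh target database
curl -X POST -H "Content-Type: application/json" http://localhost:5984/_replicate \
     -d '{"source": "demo", "target": "demo-copy", "create_target": true}'
```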
It looks like this extra size comes from the many `_local` documents in this
database (17,995 of them -- and we created none ourselves).
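For what it's worth, this is roughly how such `_local` documents can be listed and counted; it is only a sketch, assuming the `_local_docs` endpoint is available on both interfaces in this version and that the shard path matches the file above:

```
# Rough count of _local documents through the clustered interface
# (each row is on its own line in CouchDB's streamed response)
curl -s 'http://localhost:5984/demo/_local_docs' | grep -c '"id":"_local/'

# Inspect a single shard through the node-local interface (port 5986);
# this is where the internal _local/shard-sync-* checkpoints live
curl -s 'http://localhost:5986/shards%2F00000000-1fffffff%2Fdemo.1520354435/_local_docs?include_docs=true'
```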
Example `_local` documents:
```json
{
"id": "_local/0098b2317762223afc9a21b9e9ed894b",
"key": "_local/0098b2317762223afc9a21b9e9ed894b",
"value": {
"rev": "0-1"
},
"doc": {
"session_id": "4549feee1de3724118b38b56ac47acbe",
"source_last_seq": "19-g1AAAASteJzN0lFKxDAQBuCwVXRPsV4gJUnTtAXBvY...",
"replication_id_version": 4,
"history": [
{
"session_id": "4549feee1de3724118b38b56ac47acbe",
"start_time": "Wed, 27 Jun 2018 06:56:43 GMT",
"end_time": "Wed, 27 Jun 2018 08:56:47 GMT",
"start_last_seq": 0,
"end_last_seq": "19-g1AAAASteJzN0lFKxDAQBuCwVXRPsV4gJUnTtAXBv...",
"recorded_seq": "19-g1AAAASteJzN0lFKxDAQBuCwVXRPsV4gJUnTtAXBv...",
"missing_checked": 19,
"missing_found": 19,
"docs_read": 19,
"docs_written": 19,
"doc_write_failures": 0
}
]
}
}
```
or:
```json
{
"id": "_local/shard-sync-5WNtkwE5tM3B2A8XgrARtA-xrMMaaBJQHXoLx6HrvkN9w",
"key": "_local/shard-sync-5WNtkwE5tM3B2A8XgrARtA-xrMMaaBJQHXoLx6HrvkN9w",
"value": {
"rev": "0-1"
},
"doc": {
"seq": 9,
"target_uuid": "e41c58518830b41c2c908ce3d68dedb3",
"history": {
"[email protected]": [
{
"target_node": "[email protected]",
"target_uuid": "e41c58518830b41c2c908ce3d68dedb3",
"target_seq": 11,
"source_node": "[email protected]",
"source_uuid": "33f52231d93c3f99a2b683de7de4d111",
"source_seq": 9,
"timestamp": "2018-09-24T07:59:03.786497Z"
},
{
"target_node": "[email protected]",
"target_uuid": "e41c58518830b41c2c908ce3d68dedb3",
"target_seq": 8,
"source_node": "[email protected]",
"source_uuid": "33f52231d93c3f99a2b683de7de4d111",
"source_seq": 3,
"timestamp": "2018-06-02T05:10:44.108378Z"
}
]
}
}
}
```
(Note that timestamps are more than 3 months old.)
- Are these `_local` documents the cause of the extra disk usage (×26 compared
  to a freshly replicated server with the same contents)?
- Are these `_local` documents needed? Is it normal for them to stay on disk
  for months?
- How can we prevent this dark matter from accumulating?
Thanks in advance for helping us debug this strange behavior. I'd be happy
to provide more info if needed.
[ Full content available at: https://github.com/apache/couchdb/issues/1621 ]