Hello,

Yesterday we did a post-mortem on a server (part of a 3-node cluster) that
crashed because its disk was full. It runs CouchDB 2.1.2 and holds little data
(about 1 GB), yet `/var/lib/couchdb/` weighed 26 GB.

It looks like there is some kind of dark matter in the shard files.

For example, we have a small database `demo` (19 documents; it is never updated,
but it is often replicated to create clone databases). One of its shard files is
surprisingly big:

    1,4M /var/lib/couchdb/shards/00000000-1fffffff/demo.1520354435.couch

As its metadata shows, its **uncompressed data size is 23 kB**, but its
**"active" size is 11,311 kB**!

```
$ curl localhost:5984/demo
{
  "db_name": "demo",
  "update_seq": "79-g1AAAAKjeJ...",
  "sizes": {
    "file": 11679510,
    "external": 23705,
    "active": 11311270
  },
  "other": {
    "data_size": 23705
  },
  "disk_size": 11679510,
  "data_size": 11311270,
  ...
}
```
([full-output.json](https://github.com/apache/couchdb/files/2410386/full-output.json.txt))
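
The same sizes can also be read per shard through the node-local interface (a
sketch; this assumes the default node-local port 5986, with the shard name
URL-encoded):

```
$ curl 'localhost:5986/shards%2F00000000-1fffffff%2Fdemo.1520354435'
```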


We've tried compacting it and reducing `_revs_limit`; neither had any effect.
However, when we replicate this database to a new one, the copy's size is normal:

    8,2K /var/lib/couchdb/shards/00000000-1fffffff/demo-copy.1537540293.couch
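
For reference, the steps above are essentially the standard calls below (a
sketch; the `_revs_limit` value is only an example, and the replication document
is simplified):

```
# compact the database
$ curl -X POST -H 'Content-Type: application/json' localhost:5984/demo/_compact

# lower the revision limit (example value)
$ curl -X PUT -d '2' localhost:5984/demo/_revs_limit

# replicate into a fresh database, which comes out at a normal size
$ curl -X POST -H 'Content-Type: application/json' \
    localhost:5984/_replicate \
    -d '{"source": "demo", "target": "demo-copy", "create_target": true}'
```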

It looks like this extra size comes from a large number of `_local` documents:
there are 17,995 in this database, none of which we created ourselves.
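
They can be listed with the `_local_docs` endpoint (a sketch; we assume this
endpoint is available on 2.1.2 -- otherwise the shard can be queried the same
way on the node-local port 5986):

```
$ curl 'localhost:5984/demo/_local_docs?include_docs=true'
```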

Example `_local` documents:
```json
{
  "id": "_local/0098b2317762223afc9a21b9e9ed894b",
  "key": "_local/0098b2317762223afc9a21b9e9ed894b",
  "value": {
    "rev": "0-1"
  },
  "doc": {
    "session_id": "4549feee1de3724118b38b56ac47acbe",
    "source_last_seq": "19-g1AAAASteJzN0lFKxDAQBuCwVXRPsV4gJUnTtAXBvY...",
    "replication_id_version": 4,
    "history": [
      {
        "session_id": "4549feee1de3724118b38b56ac47acbe",
        "start_time": "Wed, 27 Jun 2018 06:56:43 GMT",
        "end_time": "Wed, 27 Jun 2018 08:56:47 GMT",
        "start_last_seq": 0,
        "end_last_seq": "19-g1AAAASteJzN0lFKxDAQBuCwVXRPsV4gJUnTtAXBv...",
        "recorded_seq": "19-g1AAAASteJzN0lFKxDAQBuCwVXRPsV4gJUnTtAXBv...",
        "missing_checked": 19,
        "missing_found": 19,
        "docs_read": 19,
        "docs_written": 19,
        "doc_write_failures": 0
      }
    ]
  }
}
```
or:
```json
{
  "id": "_local/shard-sync-5WNtkwE5tM3B2A8XgrARtA-xrMMaaBJQHXoLx6HrvkN9w",
  "key": "_local/shard-sync-5WNtkwE5tM3B2A8XgrARtA-xrMMaaBJQHXoLx6HrvkN9w",
  "value": {
    "rev": "0-1"
  },
  "doc": {
    "seq": 9,
    "target_uuid": "e41c58518830b41c2c908ce3d68dedb3",
    "history": {
      "[email protected]": [
        {
          "target_node": "[email protected]",
          "target_uuid": "e41c58518830b41c2c908ce3d68dedb3",
          "target_seq": 11,
          "source_node": "[email protected]",
          "source_uuid": "33f52231d93c3f99a2b683de7de4d111",
          "source_seq": 9,
          "timestamp": "2018-09-24T07:59:03.786497Z"
        },
        {
          "target_node": "[email protected]",
          "target_uuid": "e41c58518830b41c2c908ce3d68dedb3",
          "target_seq": 8,
          "source_node": "[email protected]",
          "source_uuid": "33f52231d93c3f99a2b683de7de4d111",
          "source_seq": 3,
          "timestamp": "2018-06-02T05:10:44.108378Z"
        }
      ]
    }
  }
}
```
(Note that timestamps are more than 3 months old.)


- Are these `_local` documents the cause of the extra weight (×26 compared to a
freshly replicated server with the same contents)?
- Are these `_local` documents needed? Is it normal for them to stay on disk for
months?
- How can we prevent the creation of this dark matter (and can we safely clean
up what is already there, as sketched below)?
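
A minimal sketch of the manual cleanup we have in mind (assuming the standard
`_local` document API and the `0-1` revision shown above):

```
# delete one replication checkpoint document (id taken from the first example)
$ curl -X DELETE 'localhost:5984/demo/_local/0098b2317762223afc9a21b9e9ed894b?rev=0-1'
```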

Thanks in advance for helping us debug this strange behavior. I'd be happy to
provide more info if needed.


