Hi Aseem,

This is actually expected behavior when you copy the files of a
running db and then start the copy with the default configuration. If
you remove the files ending with .id in the db directory of the local
snapshot and start up with "rebuild_idgenerators_fast=false", you
should see the accurate number of nodes, relationships and properties.
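
As a rough illustration of the first step, here is a minimal Python
sketch that lists, and optionally deletes, the files ending with .id
directly inside a db directory. The function name and the dry_run
flag are made up for this example and are not part of Neo4j. It
assumes you run it against the copied snapshot while no db is using
that directory.

```python
from pathlib import Path


def remove_id_files(db_dir, dry_run=True):
    """List the top-level files ending in .id inside db_dir.

    Hypothetical helper, not part of Neo4j. With dry_run=True nothing
    is deleted; with dry_run=False the listed files are removed.
    Subdirectories (such as index/) are left untouched.
    """
    db_path = Path(db_dir)
    # Materialize the listing first so deleting does not disturb iteration.
    id_files = sorted(
        entry for entry in db_path.iterdir()
        if entry.is_file() and entry.name.endswith(".id")
    )
    if not dry_run:
        for entry in id_files:
            entry.unlink()
    return [entry.name for entry in id_files]
```

Calling remove_id_files("data/graph.db") only reports what would be
removed. Pass dry_run=False once the list looks right, then start the
db with "rebuild_idgenerators_fast=false" as described above.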

Regarding the number of properties not matching, this could be due to
an unclean shutdown on the production system. We are planning to
improve this in the near future by allowing more aggressive reuse of
ids for properties. This will specifically improve things for
workloads that perform a lot of property updates.

-Johan

On Tue, Aug 30, 2011 at 10:05 AM, Aseem Kishore <aseem.kish...@gmail.com> wrote:
> Hey guys,
>
> We do offline backups of our db on a semi-regular basis (every few days),
> where we (1) stop the running db, (2) copy its data directory and (3)
> restart the db.
>
> A few times early on, we did running backups -- but not the proper "online"
> way -- where we simply copied the data directory while the db was still
> running. (We did this during times where we were confident no requests were
> hitting the db.)
>
> We noticed that every time we did the running backup, the number of
> properties the web admin reported -- and the space on disk of the db --
> would jump up quite a bit. We stopped doing that recently.
>
> But even now, both these numbers have gotten quite a bit higher than we
> expect, and strangely, they seem to differ highly between the running db
> and the copies.
>
> What could be causing all of this?
>
> Here are our current numbers:
>
> *Production*
> - 2,338 nodes
> - 4,473 rels
> - 114,231 props (higher than we would expect it to be, but not by an order
> of magnitude)
> - *1.39 GB!* <-- this is way unexpected, particularly since our db used to
> be in the ~10 KB ballpark, and we certainly haven't experienced hockey stick
> growth yet ;) The logical log only takes up 57 KB (0%) btw.
>
>
> *Local snapshot*
> - 2,338 nodes
> - 4,473 rels
> - *2,607,892 props!!!* <-- ???
> - *1.37 GB!* <-- equally surprisingly high, but also interesting that it's
> less than the production db's size. 0 KB logical logs.
>
>
> I looked around the wiki and searched this mailing list but didn't find many
> clues here. But as requested on another thread, here's the output of `ls -lh
> data/graph.db/`:
>
> total 1474520
> -rw-r--r--   1 aseemk  staff    11B Aug 30 00:46 active_tx_log
> drwxr-xr-x  52 aseemk  staff   1.7K Aug 30 00:46 index/
> -rw-r--r--   1 aseemk  staff   343B Aug 30 00:46 index.db
> -rw-r--r--   1 aseemk  staff   854K Aug 30 00:46 messages.log
> -rw-r--r--   1 aseemk  staff    36B Aug 30 00:46 neostore
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.id
> -rw-r--r--   1 aseemk  staff    26K Aug 30 00:46 neostore.nodestore.db
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.nodestore.db.id
> -rw-r--r--   1 aseemk  staff    62M Aug 30 00:46 neostore.propertystore.db
> -rw-r--r--   1 aseemk  staff   133B Aug 30 00:46 neostore.propertystore.db.arrays
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.arrays.id
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.id
> -rw-r--r--   1 aseemk  staff   1.0K Aug 30 00:46 neostore.propertystore.db.index
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.index.id
> -rw-r--r--   1 aseemk  staff   4.0K Aug 30 00:46 neostore.propertystore.db.index.keys
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.index.keys.id
> -rw-r--r--   1 aseemk  staff    69M Aug 30 00:46 neostore.propertystore.db.strings
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.strings.id
> -rw-r--r--   1 aseemk  staff   144K Aug 30 00:46 neostore.relationshipstore.db
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.relationshipstore.db.id
> -rw-r--r--   1 aseemk  staff    55B Aug 30 00:46 neostore.relationshiptypestore.db
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.relationshiptypestore.db.id
> -rw-r--r--   1 aseemk  staff   602B Aug 30 00:46 neostore.relationshiptypestore.db.names
> -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.relationshiptypestore.db.names.id
> -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.1
> -rw-r--r--   1 aseemk  staff     4B Aug 30 00:46 nioneo_logical.log.active
> -rw-r--r--   1 aseemk  staff   945K Aug 30 00:46 nioneo_logical.log.v0
> -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v1
> -rw-r--r--   1 aseemk  staff    33K Aug 30 00:46 nioneo_logical.log.v10
> -rw-r--r--   1 aseemk  staff    11K Aug 30 00:46 nioneo_logical.log.v11
> -rw-r--r--   1 aseemk  staff    32K Aug 30 00:46 nioneo_logical.log.v12
> -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v13
> -rw-r--r--   1 aseemk  staff    12M Aug 30 00:46 nioneo_logical.log.v14
> -rw-r--r--   1 aseemk  staff   1.4M Aug 30 00:46 nioneo_logical.log.v15
> -rw-r--r--   1 aseemk  staff   6.8M Aug 30 00:46 nioneo_logical.log.v16
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v17
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v18
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v19
> -rw-r--r--   1 aseemk  staff   1.3M Aug 30 00:46 nioneo_logical.log.v2
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v20
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v21
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v22
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v23
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v24
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v25
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v26
> -rw-r--r--   1 aseemk  staff    14M Aug 30 00:46 nioneo_logical.log.v27
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v28
> -rw-r--r--   1 aseemk  staff   7.8M Aug 30 00:46 nioneo_logical.log.v29
> -rw-r--r--   1 aseemk  staff   800K Aug 30 00:46 nioneo_logical.log.v3
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v30
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v31
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v32
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v33
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v34
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v35
> -rw-r--r--   1 aseemk  staff   4.5M Aug 30 00:46 nioneo_logical.log.v36
> -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v37
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v38
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v39
> -rw-r--r--   1 aseemk  staff    67K Aug 30 00:46 nioneo_logical.log.v4
> -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v40
> -rw-r--r--   1 aseemk  staff    16M Aug 30 00:46 nioneo_logical.log.v41
> -rw-r--r--   1 aseemk  staff    14M Aug 30 00:46 nioneo_logical.log.v42
> -rw-r--r--   1 aseemk  staff   1.0M Aug 30 00:46 nioneo_logical.log.v43
> -rw-r--r--   1 aseemk  staff   5.7M Aug 30 00:46 nioneo_logical.log.v44
> -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v5
> -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v6
> -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v7
> -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v8
> -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v9
> -rw-r--r--   1 aseemk  staff    29K Aug 30 00:46 tm_tx_log.1
> -rw-r--r--   1 aseemk  staff     0B Aug 30 00:46 tm_tx_log.2
>
>
> Looking at these numbers, I suppose the logs do add up -- is there a way to
> prune/garbage collect old logs? -- but I'm also surprised at the size of
> property stores. The latter depends on the number of properties though,
> which I'm not sure is right either, even in production. (We would see the
> property count jump by a factor of 2-3 after each running backup.)
>
> Thanks in advance for any pointers!
>
> Aseem
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
