Thanks for the insights, Johan!

As for the disk space already in use, by far the bulk of it comes from the
logs. Is there a way to prune or garbage collect them? Is simply deleting
the files safe? Should the db be stopped while I do that?
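
To put a number on "the bulk of it", here is a minimal sketch (plain file
system code, not a Neo4j tool) that adds up the rotated logical logs in a db
directory. It assumes the rotated logs are the files named
nioneo_logical.log.v<N>, as in the listing quoted below; the function name
logical_log_usage is made up for this example, and it only reports sizes,
it does not delete anything.

```python
import re
from pathlib import Path

# File-name pattern taken from the `ls -lh` listing in this thread
# (nioneo_logical.log.v0, .v1, ...). Treating exactly these files as the
# rotated logical logs is an assumption of this sketch.
LOG_PATTERN = re.compile(r"^nioneo_logical\.log\.v\d+$")


def logical_log_usage(db_dir):
    """Return (log_bytes, total_bytes) for the regular files directly in db_dir.

    Subdirectories such as index/ are skipped, so total_bytes covers only
    the top-level store and log files.
    """
    log_bytes = 0
    total_bytes = 0
    for entry in Path(db_dir).iterdir():
        if not entry.is_file():
            continue
        size = entry.stat().st_size
        total_bytes += size
        if LOG_PATTERN.match(entry.name):
            log_bytes += size
    return log_bytes, total_bytes
```

Calling logical_log_usage("data/graph.db") returns the two byte counts, so
dividing the first by the second gives the share of the directory taken up
by the rotated logs.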

Thanks much!

Aseem

On Tue, Aug 30, 2011 at 2:47 AM, Johan Svensson <[email protected]> wrote:

> Hi Aseem,
>
> This is actually expected behavior when performing a file copy of a
> running db and starting up with the default configuration. If you remove
> the files ending with .id in the db directory on the local snapshot
> and start up with "rebuild_idgenerators_fast=false" set, you should see
> the accurate number of nodes, relationships and properties.
>
> As for the number of properties not matching, this could be due to a
> non-clean shutdown on the production system. We are planning on
> improving this in the near future by allowing for more aggressive
> reuse of ids for properties. This will specifically improve things for
> workloads that perform a lot of property updates.
>
> -Johan
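
The "remove the files ending with .id" step above can be sketched as a
small script. This is only an illustration under assumptions: it moves the
files into a backup directory instead of deleting them (a deliberate swap,
so the step can be undone), it looks only at the top level of the db
directory, and it is meant for a snapshot copy that is not running. The
function name set_aside_id_files and both paths are hypothetical. Setting
rebuild_idgenerators_fast=false is a separate configuration step that this
code does not perform.

```python
import shutil
from pathlib import Path


def set_aside_id_files(db_dir, backup_dir):
    """Move every top-level file ending in .id from db_dir into backup_dir.

    Returns the names of the moved files in sorted order. Nothing is
    deleted, so the files can be copied back if needed.
    """
    db_dir = Path(db_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(db_dir.iterdir()):
        # Only regular files such as neostore.nodestore.db.id qualify;
        # directories like index/ are left alone.
        if entry.is_file() and entry.name.endswith(".id"):
            shutil.move(str(entry), str(backup_dir / entry.name))
            moved.append(entry.name)
    return moved
```

For example, set_aside_id_files("snapshot/graph.db", "snapshot/id-backup")
would leave the store files in place and return the list of .id files it
moved.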
>
> On Tue, Aug 30, 2011 at 10:05 AM, Aseem Kishore <[email protected]> wrote:
> > Hey guys,
> >
> > We do offline backups of our db on a semi-regular basis (every few days),
> > where we (1) stop the running db, (2) copy its data directory and (3)
> > restart the db.
> >
> > A few times early on, we did running backups -- but not the proper "online"
> > way -- where we simply copied the data directory while the db was still
> > running. (We did this during times where we were confident no requests were
> > hitting the db.)
> >
> > We noticed that every time we did the running backup, the number of
> > properties the web admin reported -- and the space on disk of the db --
> > would jump up quite a bit. We stopped doing that recently.
> >
> > But even now, both these numbers have gotten quite a bit higher than we
> > expect them to be, and strangely, they seem to differ widely between the
> > running db and the copies.
> >
> > What could be causing all of this?
> >
> > Here are our current numbers:
> >
> > *Production*
> > - 2,338 nodes
> > - 4,473 rels
> > - 114,231 props (higher than we would expect it to be, but not by an order
> >   of magnitude)
> > - *1.39 GB!* <-- this is way unexpected, particularly since our db used to
> >   be in the ~10 KB ballpark, and we certainly haven't experienced hockey
> >   stick growth yet ;) The logical log only takes up 57 KB (0%) btw.
> >
> >
> > *Local snapshot*
> > - 2,338 nodes
> > - 4,473 rels
> > - *2,607,892 props!!!* <-- ???
> > - *1.37 GB!* <-- equally surprisingly high, but also interesting that it's
> >   less than the production db's size. 0 KB logical logs.
> >
> >
> > I looked around the wiki and searched this mailing list but didn't find
> > many clues here. But as requested on another thread, here's the output of
> > `ls -lh data/graph.db/`:
> >
> > total 1474520
> > -rw-r--r--   1 aseemk  staff    11B Aug 30 00:46 active_tx_log
> > drwxr-xr-x  52 aseemk  staff   1.7K Aug 30 00:46 index/
> > -rw-r--r--   1 aseemk  staff   343B Aug 30 00:46 index.db
> > -rw-r--r--   1 aseemk  staff   854K Aug 30 00:46 messages.log
> > -rw-r--r--   1 aseemk  staff    36B Aug 30 00:46 neostore
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.id
> > -rw-r--r--   1 aseemk  staff    26K Aug 30 00:46 neostore.nodestore.db
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.nodestore.db.id
> > -rw-r--r--   1 aseemk  staff    62M Aug 30 00:46 neostore.propertystore.db
> > -rw-r--r--   1 aseemk  staff   133B Aug 30 00:46 neostore.propertystore.db.arrays
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.arrays.id
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.id
> > -rw-r--r--   1 aseemk  staff   1.0K Aug 30 00:46 neostore.propertystore.db.index
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.index.id
> > -rw-r--r--   1 aseemk  staff   4.0K Aug 30 00:46 neostore.propertystore.db.index.keys
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.index.keys.id
> > -rw-r--r--   1 aseemk  staff    69M Aug 30 00:46 neostore.propertystore.db.strings
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.propertystore.db.strings.id
> > -rw-r--r--   1 aseemk  staff   144K Aug 30 00:46 neostore.relationshipstore.db
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.relationshipstore.db.id
> > -rw-r--r--   1 aseemk  staff    55B Aug 30 00:46 neostore.relationshiptypestore.db
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.relationshiptypestore.db.id
> > -rw-r--r--   1 aseemk  staff   602B Aug 30 00:46 neostore.relationshiptypestore.db.names
> > -rw-r--r--   1 aseemk  staff     9B Aug 30 00:46 neostore.relationshiptypestore.db.names.id
> > -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.1
> > -rw-r--r--   1 aseemk  staff     4B Aug 30 00:46 nioneo_logical.log.active
> > -rw-r--r--   1 aseemk  staff   945K Aug 30 00:46 nioneo_logical.log.v0
> > -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v1
> > -rw-r--r--   1 aseemk  staff    33K Aug 30 00:46 nioneo_logical.log.v10
> > -rw-r--r--   1 aseemk  staff    11K Aug 30 00:46 nioneo_logical.log.v11
> > -rw-r--r--   1 aseemk  staff    32K Aug 30 00:46 nioneo_logical.log.v12
> > -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v13
> > -rw-r--r--   1 aseemk  staff    12M Aug 30 00:46 nioneo_logical.log.v14
> > -rw-r--r--   1 aseemk  staff   1.4M Aug 30 00:46 nioneo_logical.log.v15
> > -rw-r--r--   1 aseemk  staff   6.8M Aug 30 00:46 nioneo_logical.log.v16
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v17
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v18
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v19
> > -rw-r--r--   1 aseemk  staff   1.3M Aug 30 00:46 nioneo_logical.log.v2
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v20
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v21
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v22
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v23
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v24
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v25
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v26
> > -rw-r--r--   1 aseemk  staff    14M Aug 30 00:46 nioneo_logical.log.v27
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v28
> > -rw-r--r--   1 aseemk  staff   7.8M Aug 30 00:46 nioneo_logical.log.v29
> > -rw-r--r--   1 aseemk  staff   800K Aug 30 00:46 nioneo_logical.log.v3
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v30
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v31
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v32
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v33
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v34
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v35
> > -rw-r--r--   1 aseemk  staff   4.5M Aug 30 00:46 nioneo_logical.log.v36
> > -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v37
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v38
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v39
> > -rw-r--r--   1 aseemk  staff    67K Aug 30 00:46 nioneo_logical.log.v4
> > -rw-r--r--   1 aseemk  staff    25M Aug 30 00:46 nioneo_logical.log.v40
> > -rw-r--r--   1 aseemk  staff    16M Aug 30 00:46 nioneo_logical.log.v41
> > -rw-r--r--   1 aseemk  staff    14M Aug 30 00:46 nioneo_logical.log.v42
> > -rw-r--r--   1 aseemk  staff   1.0M Aug 30 00:46 nioneo_logical.log.v43
> > -rw-r--r--   1 aseemk  staff   5.7M Aug 30 00:46 nioneo_logical.log.v44
> > -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v5
> > -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v6
> > -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v7
> > -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v8
> > -rw-r--r--   1 aseemk  staff    16B Aug 30 00:46 nioneo_logical.log.v9
> > -rw-r--r--   1 aseemk  staff    29K Aug 30 00:46 tm_tx_log.1
> > -rw-r--r--   1 aseemk  staff     0B Aug 30 00:46 tm_tx_log.2
> >
> >
> > Looking at these numbers, I suppose the logs do add up -- is there a way to
> > prune/garbage collect old logs? -- but I'm also surprised at the size of
> > property stores. The latter depends on the number of properties though,
> > which I'm not sure is right either, even in production. (We would see the
> > property count jump by a factor of 2-3 after each running backup.)
> >
> > Thanks in advance for any pointers!
> >
> > Aseem
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
>
