but still does not explain the huge difference in size between (1) and (2) given the fact that the docs are simple jsons and no document had 2 revisions ever.
Compaction means discarding all revisions but the latest (newest).

/Bogdan

On Mon, Oct 10, 2016 at 1:05 PM, Jan Lehnardt <[email protected]> wrote:

>
> > On 10 Oct 2016, at 11:54, Bogdan Andu <[email protected]> wrote:
> >
> > Hi,
> >
> > I return with updated info :
> >
> > I compacted db1 (CouchDB/1.6.1) on the source and now has 350 MB from 2.5 GB
> > with 362849 no. of documents
> > I also compacted the views but no difference in size .
> >
> > The database stores documents of the following form:
> >
> > {
> >   "_id": "00006df04672a0c0e0da142ad8cd90b9",
> >   "_rev": "1-a14afd34d5a52e3f6ae515c9adcff2d3",
> >   "local_id": "110361",
> >   "email": "[email protected]",
> >   "sent_date": "2007-06-29 12:20:31",
> >   "regtype": "n"
> > }
> >
> > Huge difference between 2.5GB and 350 MB and the
> > documents had no revisions.
> >
> > If Couch is able to reduce a db's size to this magnitude after compaction
> > why cannot maintain the aprox. the same size limit during
> > normal operations(there are no deletions, no updates , only insertions).
>
> For CouchDB Compaction is considered normal operation.
>
> > Maybe the b-tree is optimized only after compaction, and not during
> > repetitive insertions
> > (aprox. 2000 insertions/day).
> >
> > and for the sake of consistency..
> >
> > after replication to 2.0 couchdb, the same database
> > (with views generated took ~ 20 minutes / 362849 docs), we have:
> >
> > 69.3 MB / 362849 documents
> >
> > Now the big surprise is the huge difference in
> > size resulted after compaction on 1.6.1
> >
> > to summarize :
> >
> > (1) 1.6.1 original             2.5 GB    362849 docs
> > (2) 1.6.1 compacted            350 MB    362849 docs
> > (3) 2.0 replicate (from (1))   69.3 MB   362849 docs
>
> These numbers confirm the significant improvements that were done
> to the compactor for 2.0.
> I’m glad it’s showing for you :)
>
> https://blog.couchdb.org/2016/08/10/
>
> Best
> Jan
> --
>
> > /Bogdan
> >
> > On Fri, Oct 7, 2016 at 4:43 PM, Adam Kocoloski <[email protected]> wrote:
> >
> >> Lots of good questions there.
> >>
> >> On the storage size, note that even when you write only one revision of
> >> each document the database will accumulate some wasted space. Inserts to
> >> the database cause internal btree structures to be updated, and due to the
> >> copy-on-write nature of the storage engine the old btree nodes are left
> >> around in the file.
> >>
> >> We did make some changes in the compaction system that produce smaller
> >> files at the end of the day. You can read more about those changes here -
> >> https://blog.couchdb.org/2016/08/10/feature-compaction/ - but they don’t
> >> explain the difference that you reported. Perhaps you didn’t compact the
> >> source database at all?
> >>
> >> You are correct that both design documents and mango will build
> >> btree-based indexes to answer their queries. I would like to see us add
> >> functionality to mango over time so that it can cover the large majority of
> >> use cases where folks need to appeal to views in design documents, but
> >> we’re not quite there yet. One example where mango cannot help you today is
> >> reduce functions; if you want to aggregate the values in your index you
> >> need to drop down and build a view for that.
> >>
> >> In terms of performance, mango should be moderately faster at building an
> >> index because there’s no JavaScript roundtrip. Querying performance should
> >> be ~identical.
> >> Cheers,
> >>
> >> Adam
> >>
> >>> On Oct 7, 2016, at 7:56 AM, Thanos Vassilakis <[email protected]> wrote:
> >>>
> >>> Good questions
> >>>
> >>> Sent from my iPhone
> >>>
> >>>> On Oct 7, 2016, at 5:29 AM, Bogdan Andu <[email protected]> wrote:
> >>>>
> >>>> I see the data management is totally different(and better).
> >>>> now there is a _dbs.couch for a registry-like database for databases
> >>>> and actual databases are located in data/shards subdirectories.
> >>>>
> >>>> so.. only replication works here..
> >>>> and one can replicate many databases in parallel.
> >>>>
> >>>> another difference I see is the size of databases.
> >>>>
> >>>> 2.0 version keep a very small size of databases compared to 1.6.1 version.
> >>>>
> >>>> Is there any change in storage engine that makes so big differences in
> >>>> database sizes?
> >>>>
> >>>> all records in db1 in 1.6.1 have only one revision like (1-...) format
> >>>>
> >>>> db1 in 1.6.1 is 2.5GB with 362849 records
> >>>> after replication:
> >>>> db1 in 2.0 has 69.3 MB with 362849 records
> >>>>
> >>>> when is recommended to use design documents and when mango queries.
> >>>> is mango intended to replace design documents although I assume both
> >>>> build a view tree for the query in question.
> >>>>
> >>>> which one is faster?
> >>>> what are the use-cases for each one of the query methods?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Bogdan
> >>>>
> >>>>> On Fri, Oct 7, 2016 at 11:20 AM, max <[email protected]> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Install 2.0 version on another server or just make it listen on different
> >>>>> port than 1.6 then replicate your data ;)
> >>>>>
> >>>>> 2016-10-07 9:49 GMT+02:00 Bogdan Andu <[email protected]>:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> I configured a single-node CouchDB 2.0 instance and
> >>>>>> I copied in data directory 1.6.1 couch databases.
> >>>>>>
> >>>>>> But the databases does not show up in Fauxton, only the
> >>>>>> test databases:
> >>>>>>
> >>>>>> ["_global_changes","_replicator","_users","verifytestdb"].
> >>>>>>
> >>>>>> Is there a way to make CouchDB 2.0 read 1.6.1 couch files
> >>>>>> without importing?
> >>>>>>
> >>>>>> /Bogdan
> >>>>>
> >>
>
> --
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/
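
For anyone following along, the operations discussed in this thread (compacting a database and its views, replicating from the 1.6.1 instance into 2.0, and creating/querying a Mango index) map onto CouchDB's HTTP API roughly as sketched below. This is only an illustration under assumptions not stated in the thread: both instances are taken to run on localhost, with 1.6.1 on port 5984 and 2.0 on port 15984; `my_ddoc` is a made-up design document name; and the helper function names are hypothetical. The sketch only builds the requests and prints them, so nothing is sent unless you call `send()` yourself.

```python
import json
import urllib.request


def post_json(url, body=None):
    """Build (but do not send) a POST request with a JSON content type."""
    data = b"" if body is None else json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compact_database(server, db):
    # POST /{db}/_compact starts database compaction.
    return post_json(f"{server}/{db}/_compact")


def compact_views(server, db, ddoc):
    # POST /{db}/_compact/{ddoc} compacts the view indexes of one design
    # document; ddoc is given without the "_design/" prefix.
    return post_json(f"{server}/{db}/_compact/{ddoc}")


def replicate(server, source, target):
    # POST /_replicate; create_target asks CouchDB to create the target db.
    body = {"source": source, "target": target, "create_target": True}
    return post_json(f"{server}/_replicate", body)


def mango_index(server, db, fields, name):
    # POST /{db}/_index creates a Mango (JSON) index on the given fields.
    body = {"index": {"fields": fields}, "name": name, "type": "json"}
    return post_json(f"{server}/{db}/_index", body)


def mango_find(server, db, selector, limit=25):
    # POST /{db}/_find runs a Mango query.
    return post_json(f"{server}/{db}/_find", {"selector": selector, "limit": limit})


def send(request):
    """Actually perform a request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


OLD = "http://localhost:5984"   # assumption: the 1.6.1 instance
NEW = "http://localhost:15984"  # assumption: the 2.0 instance on another port

planned = [
    compact_database(OLD, "db1"),
    compact_views(OLD, "db1", "my_ddoc"),  # "my_ddoc" is a placeholder name
    replicate(NEW, f"{OLD}/db1", f"{NEW}/db1"),
    mango_index(NEW, "db1", ["sent_date"], "by-sent-date"),
    mango_find(NEW, "db1", {"sent_date": {"$gte": "2007-06-01"}}),
]

# Dry run: show what would be sent without touching either server.
for request in planned:
    print(request.get_method(), request.full_url)
```

To run a step for real, pass one of the built requests to `send()`, for example `send(planned[0])`. As Adam notes above, Mango has no reduce step, so aggregation would still need a view in a design document.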

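
To put the three sizes reported in the thread side by side, the short calculation below works out the on-disk bytes per document and the shrink factor relative to the uncompacted 1.6.1 file. It uses only the figures Bogdan posted (2.5 GB, 350 MB, 69.3 MB, 362849 documents). Reading "MB" and "GB" as binary units (MiB/GiB) is an assumption, and the function names are made up for the example.

```python
DOCS = 362849          # document count reported in the thread
MIB = 1024 ** 2        # assumption: sizes in the thread are binary units

# Sizes as reported, converted to bytes.
sizes = {
    "1.6.1 original": 2.5 * 1024 * MIB,
    "1.6.1 compacted": 350 * MIB,
    "2.0 replica": 69.3 * MIB,
}


def bytes_per_doc(size_bytes, docs=DOCS):
    """Average file bytes consumed per stored document."""
    return size_bytes / docs


def report(sizes, docs=DOCS):
    """Return (label, size in MiB, bytes per doc, shrink factor) rows."""
    baseline = sizes["1.6.1 original"]
    return [
        (label, size / MIB, bytes_per_doc(size, docs), baseline / size)
        for label, size in sizes.items()
    ]


for label, mib, per_doc, factor in report(sizes):
    print(f"{label:<16} {mib:8.1f} MiB {per_doc:8.0f} B/doc {factor:6.1f}x")
```

Under that assumption this comes out at roughly 7 KB per document before compaction, about 1 KB after compacting on 1.6.1, and about 200 bytes after replicating into 2.0.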