but that still does not explain the huge difference in size between (1) and (2),
given that the docs are simple JSON objects and no document ever had
a second revision.

Compaction means discarding all revisions but the latest (newest).

/Bogdan
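
For scale, the three sizes summarized in the quoted thread below can be turned into bytes per document. This is only a rough sketch: it assumes the reported figures use decimal units (1 MB = 10**6 bytes), and the helper names are made up for illustration.

```python
# Sizes as reported in this thread for the same 362849 documents.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 MB = 10**6 bytes).
DOCS = 362849
REPORTED_BYTES = {
    "1.6.1 original": 2.5e9,
    "1.6.1 compacted": 350e6,
    "2.0 replicated": 69.3e6,
}


def bytes_per_doc(total_bytes, docs=DOCS):
    """Average on-disk bytes attributable to one document."""
    return total_bytes / docs


def shrink_factor(before_bytes, after_bytes):
    """How many times smaller the file became."""
    return before_bytes / after_bytes


if __name__ == "__main__":
    for label, size in REPORTED_BYTES.items():
        print(f"{label:16s} {bytes_per_doc(size):8.0f} bytes/doc")
    print("compaction on 1.6.1:",
          round(shrink_factor(2.5e9, 350e6), 1), "x smaller")
    print("1.6.1 compacted vs 2.0:",
          round(shrink_factor(350e6, 69.3e6), 1), "x smaller")
```

Under that assumption the uncompacted file carries several kilobytes per document, for documents whose JSON body is well under 200 bytes, which is why the gap stands out.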

On Mon, Oct 10, 2016 at 1:05 PM, Jan Lehnardt <[email protected]> wrote:

>
> > On 10 Oct 2016, at 11:54, Bogdan Andu <[email protected]> wrote:
> >
> > Hi,
> >
> > I return with updated info :
> >
> > I compacted db1 (CouchDB/1.6.1) on the source and it now takes 350 MB,
> > down from 2.5 GB, with 362849 documents.
> > I also compacted the views, but that made no difference in size.
> >
> > The database stores documents of the following form:
> >
> > {
> >   "_id": "00006df04672a0c0e0da142ad8cd90b9",
> >   "_rev": "1-a14afd34d5a52e3f6ae515c9adcff2d3",
> >   "local_id": "110361",
> >   "email": "[email protected]",
> >   "sent_date": "2007-06-29 12:20:31",
> >   "regtype": "n"
> > }
> >
> > That is a huge difference, 2.5 GB versus 350 MB, and the
> > documents had no revisions.
> >
> > If Couch is able to reduce a db's size by this magnitude through compaction,
> > why can it not maintain approximately the same size during
> > normal operations (there are no deletions and no updates, only insertions)?
>
> For CouchDB, compaction is considered normal operation.
>
> >
> > Maybe the B-tree is optimized only after compaction, and not during
> > repeated insertions (approx. 2000 insertions/day).
> >
> > and for the sake of consistency..
> >
> > after replication to CouchDB 2.0, the same database
> > (generating the views took ~20 minutes for 362849 docs) comes to:
> >
> > 69.3 MB / 362849 documents
> >
> > Now the big surprise is the huge difference in
> > size that resulted from compaction on 1.6.1.
> >
> >
> > to summarize :
> >
> > (1) 1.6.1     original             2.5 GB      362849 docs
> >
> > (2) 1.6.1     compacted            350 MB      362849 docs
> >
> > (3) 2.0       replicate (from (1)) 69.3 MB     362849 docs
>
> These numbers confirm the significant improvements that were done
> to the compactor for 2.0. I’m glad it’s showing for you :)
>
> https://blog.couchdb.org/2016/08/10/
>
> Best
> Jan
> --
>
>
> >
> > /Bogdan
> >
> >
> >
> >
> >
> >
> > On Fri, Oct 7, 2016 at 4:43 PM, Adam Kocoloski <[email protected]> wrote:
> >
> >> Lots of good questions there.
> >>
> >> On the storage size, note that even when you write only one revision of
> >> each document the database will accumulate some wasted space. Inserts to
> >> the database cause internal btree structures to be updated, and due to the
> >> copy-on-write nature of the storage engine the old btree nodes are left
> >> around in the file.
> >>
> >> We did make some changes in the compaction system that produce smaller
> >> files at the end of the day. You can read more about those changes here
> >> (https://blog.couchdb.org/2016/08/10/feature-compaction/), but they don’t
> >> explain the difference that you reported. Perhaps you didn’t compact the
> >> source database at all?
> >>
> >> You are correct that both design documents and mango will build
> >> btree-based indexes to answer their queries. I would like to see us add
> >> functionality to mango over time so that it can cover the large majority of
> >> use cases where folks need to appeal to views in design documents, but
> >> we’re not quite there yet. One example where mango cannot help you today is
> >> reduce functions; if you want to aggregate the values in your index you
> >> need to drop down and build a view for that.
> >>
> >> In terms of performance, mango should be moderately faster at building an
> >> index because there’s no JavaScript roundtrip. Querying performance should
> >> be ~identical. Cheers,
> >>
> >> Adam
> >>
> >>> On Oct 7, 2016, at 7:56 AM, Thanos Vassilakis <[email protected]> wrote:
> >>>
> >>> Good questions
> >>>
> >>> Sent from my iPhone
> >>>
> >>>> On Oct 7, 2016, at 5:29 AM, Bogdan Andu <[email protected]> wrote:
> >>>>
> >>>> I see the data management is totally different (and better).
> >>>> Now there is a _dbs.couch file, a registry-like database of databases,
> >>>> and the actual databases are located in data/shards subdirectories.
> >>>>
> >>>> so.. only replication works here..
> >>>> and one can replicate many databases in parallel.
> >>>>
> >>>> another difference I see is the size of databases.
> >>>>
> >>>> The 2.0 version keeps databases much smaller than the 1.6.1 version does.
> >>>>
> >>>> Is there any change in the storage engine that makes such a big difference in
> >>>> database sizes?
> >>>>
> >>>> all records in db1 in 1.6.1 have only one revision like (1-...) format
> >>>>
> >>>> db1 in 1.6.1 is 2.5GB with 362849 records
> >>>> after replication:
> >>>> db1 in 2.0 has 69.3 MB with 362849 records
> >>>>
> >>>> When is it recommended to use design documents, and when mango queries?
> >>>> Is mango intended to replace design documents? I assume both
> >>>> build a view tree for the query in question.
> >>>>
> >>>> which one is faster?
> >>>> what are the use-cases for each one of the query methods?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Bogdan
> >>>>
> >>>>
> >>>>
> >>>>> On Fri, Oct 7, 2016 at 11:20 AM, max <[email protected]> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Install the 2.0 version on another server, or just make it listen on a
> >>>>> different port than 1.6, then replicate your data ;)
> >>>>>
> >>>>> 2016-10-07 9:49 GMT+02:00 Bogdan Andu <[email protected]>:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> I configured a single-node CouchDB 2.0 instance and
> >>>>>> copied the 1.6.1 couch databases into its data directory.
> >>>>>>
> >>>>>> But the databases do not show up in Fauxton, only the
> >>>>>> test databases:
> >>>>>>
> >>>>>> ["_global_changes","_replicator","_users","verifytestdb"].
> >>>>>>
> >>>>>> Is there a way to make CouchDB 2.0 read 1.6.1 couch files
> >>>>>> without importing?
> >>>>>>
> >>>>>> /Bogdan
> >>>>>
> >>
> >>
>
> --
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/
>
>
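
Adam's point above about copy-on-write can be illustrated with a toy model. This is only a sketch of an append-only B-tree, not CouchDB's actual file format: it assumes every insert is committed on its own and appends the document plus a fresh copy of each node on the path from leaf to root, and the fanout, node size and document size are made-up parameters. Real files also hold more than one index, headers, and batched writes, all of which are ignored here.

```python
def tree_height(num_docs, fanout):
    """Levels of nodes needed to index num_docs entries (num_docs >= 1)."""
    height, capacity = 1, fanout
    while capacity < num_docs:
        capacity *= fanout
        height += 1
    return height


def live_node_count(num_docs, fanout):
    """Nodes still reachable from the newest root (num_docs >= 1)."""
    nodes = 0
    level = num_docs
    while True:
        level = -(-level // fanout)  # ceiling division: nodes on this level
        nodes += level
        if level == 1:
            return nodes


def append_only_sizes(num_docs, doc_bytes, node_bytes, fanout):
    """Return (file_bytes, live_bytes) for the toy append-only file.

    file_bytes: everything ever appended (what sits on disk before compaction).
    live_bytes: documents plus only the nodes of the latest tree
                (roughly what a compacted copy would need).
    """
    file_bytes = 0
    for count in range(1, num_docs + 1):
        # One insert: the document body plus a rewritten leaf-to-root path.
        file_bytes += doc_bytes + tree_height(count, fanout) * node_bytes
    live_bytes = (num_docs * doc_bytes
                  + live_node_count(num_docs, fanout) * node_bytes)
    return file_bytes, live_bytes


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shape of the effect.
    on_disk, live = append_only_sizes(
        num_docs=362849, doc_bytes=200, node_bytes=1000, fanout=50)
    print(f"before compaction: {on_disk / 1e6:.0f} MB")
    print(f"live data only:    {live / 1e6:.0f} MB")
```

Even with insert-only traffic and a single revision per document, the superseded node copies dominate the file in this model, so the file keeps growing until something rewrites it with live data only.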
