On Wed, Jan 13, 2010 at 11:34 AM, Adam Kocoloski (JIRA) <[email protected]> wrote:
>
>    [ 
> https://issues.apache.org/jira/browse/COUCHDB-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799896#action_12799896
>  ]
>
> Adam Kocoloski commented on COUCHDB-623:
> ----------------------------------------
>
> I believe by "consistency guarantees" Chris meant that a view request uses a 
> single snapshot of the view index for the entire response.  Even if documents 
> are changed in the interim, and even if someone else has triggered a view 
> update, your response will still accurately reflect the state of the DB at a 
> single moment in time.

Thanks Adam, that's exactly what I'm talking about.
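
For anyone following along, here is a toy sketch of the snapshot behaviour Adam describes. It is only an illustration under simplifying assumptions: the class and method names (AppendOnlyIndex, update, snapshot) are hypothetical, and it copies a whole dict per update rather than appending btree nodes to a file, so it is not CouchDB's actual code or file format. The point it shows is that a reader which pins one version at the start of its request is unaffected by updates appended afterwards.

```python
# Toy model of snapshot reads over an append-only store.
# All names are hypothetical; this is not CouchDB's implementation.

class AppendOnlyIndex:
    def __init__(self):
        # Each entry is one version of the index, never modified after it is appended.
        self._versions = [{}]

    def update(self, key, value):
        # Writers never touch an existing version; they append a new one.
        new_version = dict(self._versions[-1])
        new_version[key] = value
        self._versions.append(new_version)

    def snapshot(self):
        # A reader pins the latest version once, at the start of its request.
        return self._versions[-1]


index = AppendOnlyIndex()
index.update("a", 1)
index.update("b", 2)

snap = index.snapshot()       # the view request starts here
index.update("a", 99)         # an update arrives while the response is being sent
index.update("c", 3)

rows = sorted(snap.items())   # the response still reflects the pinned version
latest = index.snapshot()     # a new request would see the later updates
```

The cost of this style is also visible in the sketch: every update leaves the old version behind, which is roughly why an append-only view file keeps growing until it is compacted.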

>
>> File format for views is space and time inefficient - use a better one
>> ----------------------------------------------------------------------
>>
>>                 Key: COUCHDB-623
>>                 URL: https://issues.apache.org/jira/browse/COUCHDB-623
>>             Project: CouchDB
>>          Issue Type: Improvement
>>          Components: Database Core
>>    Affects Versions: 0.10
>>            Reporter: Roger Binns
>>
>> This was discussed on the dev mailing list over the last few days and noted 
>> here so it isn't forgotten.
>> The main database file format is optimised for data integrity - not losing 
>> or mangling documents - and rightly so.
>> That same append-only format is also used for views where it is a poor fit.  
>> The more random the ordering of data supplied, the larger the btree.  The 
>> larger the keys (in bytes) the larger the btree.  As an example my 2GB of 
>> raw JSON data turns into a 3.9GB CouchDB database but a 27GB view file 
>> (before compacting to 900MB).  Since views are not replicated, this requires 
>> a disproportionate amount of disk space on each receiving server (not to 
>> mention I/O load).  The format also affects view generation performance.  By 
>> loading my documents into CouchDB in an order by the most emitted value in 
>> views I was able to reduce load time from 75 minutes to 40 minutes with the 
>> view file size being 15GB instead of 27GB, but still very distant from the 
>> 900MB post compaction.
>> Views are a performance enhancement.  They save you from having to visit 
>> every document when doing some queries.  The data within a view is 
>> generated and hence the only consequence of losing view data is a 
>> performance one and the view can be regenerated anyway.  Consequently the 
>> file format should be one that is optimised for performance and size.  The 
>> only integrity feature needed is the ability to tell that the view is 
>> potentially corrupt (e.g. the power failed while it was being 
>> generated/updated).
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io
