On Apr 27, 2008, at 03:12, Anthony Mills wrote:
Thank you everyone for answering my questions.

Here is the way I understand it. The first time a view is run, it creates a key-value list from all documents. Future calls to the view update the key-value list with changed documents [added, deleted, updated]. If a startkey, endkey or key is used, only those keys that match in the list are returned.

Where "match" means either single entries from the view index or consecutive ranges, but nothing with gaps.


If I use a different startkey, endkey or key, the key-value list is not rebuilt; it uses the keys from the first view.
Did I get it right?

Yes.


Sorry about being obtuse. I have a project that can have over 10 million documents and I need to understand how they can be indexed.

As Chris mentioned, it might be best to play around with sample data to get a feel for views. Check out Futon, our built-in administration client. It lets you define ad-hoc queries that you can modify at will and later save permanently: http://localhost:5984/_utils/

Cheers
Jan
--




Thank you,
Anthony

On Apr 26, 2008, at 2:11 PM, Jan Lehnardt wrote:

Heya Anthony,
On Apr 26, 2008, at 20:50, Anthony Mills wrote:
Maybe I am missing something. When you create a view, does it create indexes for attributes in the database? When you add new documents, do they automatically create the index for the attributes for the view?

A view has only a single index, keyed by whatever you pass as the first argument to the map() function. Nothing else is going on automatically.


Also, can I call my view with something like ?startkey=['20080403t000000', 1234]&endkey=['20080405t235959', 1234] to

function(doc){
        if(doc.type == "hello"){
                map([doc.date, doc.number], doc);
        }
}

Then, through the magic of couchdb, I'll only get back those documents between the April 3rd and 5th whose attribute number=1234?

Nope, you'd need a [doc.number, doc.date] index for that. It is straightforward rather than magical. The map() function just creates a key-value list that is sorted by key, and you can only query ranges within the key-space.
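
To make the difference between the two key orders concrete, here is a minimal sketch in plain JavaScript. It is not CouchDB code: compareKeys, buildIndex and queryRange are hypothetical helpers, the comparison is a simplified stand-in for CouchDB's collation (numbers and strings only), and the sample documents are made up.

```javascript
// Simplified comparison of composite keys: element by element, left to right.
// CouchDB's real collation covers more types; this is only an illustration.
function compareKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// Build a key-value list sorted by key, like the output of a map function.
function buildIndex(docs, keyOf) {
  return docs
    .map((doc) => ({ key: keyOf(doc), value: doc }))
    .sort((x, y) => compareKeys(x.key, y.key));
}

// A range query returns every row between startkey and endkey (inclusive).
// A real index would seek to the start instead of scanning the whole list.
function queryRange(index, startkey, endkey) {
  return index.filter(
    (row) =>
      compareKeys(row.key, startkey) >= 0 && compareKeys(row.key, endkey) <= 0
  );
}

// Made-up sample documents.
const docs = [
  { date: "20080403t120000", number: 1234 },
  { date: "20080404t090000", number: 9999 },
  { date: "20080405t180000", number: 1234 },
  { date: "20080410t000000", number: 1234 },
];

// Keyed by [date, number]: the range covers every row in the date window,
// including the document whose number is 9999.
const byDateNumber = buildIndex(docs, (d) => [d.date, d.number]);
const dateFirst = queryRange(
  byDateNumber,
  ["20080403t000000", 1234],
  ["20080405t235959", 1234]
);

// Keyed by [number, date]: rows for 1234 sit next to each other, ordered by
// date, so the same window is one contiguous slice with no strays.
const byNumberDate = buildIndex(docs, (d) => [d.number, d.date]);
const numberFirst = queryRange(
  byNumberDate,
  [1234, "20080403t000000"],
  [1234, "20080405t235959"]
);
```

With these sample documents, dateFirst holds three rows (one of them with number 9999), while numberFirst holds only the two rows for number 1234 that fall inside the date window.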


Will couchdb only search through records that match the key? or will it need to go through all documents every time I call the view?

To build the view index, CouchDB will go through all documents, but only once. For documents that change, get deleted or added, CouchDB incrementally updates the index. Also, view indexes are built when you query the view, not when you add documents.
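
The behaviour described above can be pictured with a toy model. This is only a sketch under stated assumptions, not how CouchDB is implemented: ToyView, the shape of the change records and the emit callback are all hypothetical, and the loop still walks the full change list where a real database would jump straight to the new changes.

```javascript
// Toy model: each change carries a sequence number. The view remembers the
// last sequence it has processed and, when queried, runs the map function
// only over changes made since then.
class ToyView {
  constructor(mapFn) {
    this.mapFn = mapFn;
    this.rows = new Map(); // doc id -> rows emitted for that doc
    this.lastSeq = 0;
    this.mapCalls = 0; // how many documents have been (re)mapped so far
  }

  // Indexing happens here, at query time, not when documents are written.
  query(changes) {
    for (const change of changes) {
      if (change.seq <= this.lastSeq) continue; // already indexed
      this.rows.delete(change.id); // drop rows from the old revision
      if (!change.deleted) {
        const emitted = [];
        this.mapFn(change.doc, (key, value) => emitted.push({ key, value }));
        this.rows.set(change.id, emitted);
        this.mapCalls++;
      }
      this.lastSeq = change.seq;
    }
    // Return all rows sorted by key.
    return [...this.rows.values()]
      .flat()
      .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  }
}

const view = new ToyView((doc, emit) => {
  if (doc.type === "hello") emit(doc.date, doc.number);
});

const changes = [
  { seq: 1, id: "a", doc: { type: "hello", date: "20080403", number: 1 } },
  { seq: 2, id: "b", doc: { type: "hello", date: "20080401", number: 2 } },
];
const first = view.query(changes); // first query maps both documents
const callsAfterFirst = view.mapCalls;

// Update document "a"; the next query maps only that one document.
changes.push({
  seq: 3,
  id: "a",
  doc: { type: "hello", date: "20080405", number: 1 },
});
const second = view.query(changes);
const callsAfterSecond = view.mapCalls;
```

In this model the first query runs the map function over every document, and the second query runs it only once more, for the document that changed.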


To get nerdy, I want my views to find records in O(log n) not O(n).

You get your results in O(1) ;-) (after the first query to each view).

In relational terms, think of a view as an index on a column without the write penalty. So have as many as you might need.

I hope that helps, feel free to send more questions :)

Cheers
Jan
--





Thanks,

Anthony

On Apr 26, 2008, at 1:02 AM, Chris Anderson wrote:

Anthony,

http://wiki.apache.org/couchdb/ViewCollation is the way to accomplish
tasks like that.

Christopher Lenz has a write-up of how to use view collation to sort
views, achieving comments grouped by parent blog post.

http://www.cmlenz.net/archives/2007/10/couchdb-joins

In your case you could index a view with date and type, like this

[type, date]

and then if you had say 5 types you'd do 5 GET queries against the
database, each one fetching only the documents for that day.
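
A rough sketch of how those per-type queries could be put together, assuming
keys of the form [type, date] and the date format used elsewhere in this
thread. The database name, view path and type names are hypothetical; only
the startkey and endkey parameters come from the discussion.

```javascript
// Hypothetical database, view path and type names, for illustration only.
const VIEW_URL = "http://localhost:5984/mydb/_view/docs/by_type_and_date";
const types = ["a", "b", "c", "d", "e"];

// Build one range query per type, each covering a single day.
// Keys are sent as JSON, so they are serialized and then URL-encoded.
function dayQueries(baseUrl, typeNames, day) {
  return typeNames.map((type) => {
    const startkey = JSON.stringify([type, day + "t000000"]);
    const endkey = JSON.stringify([type, day + "t235959"]);
    return (
      baseUrl +
      "?startkey=" + encodeURIComponent(startkey) +
      "&endkey=" + encodeURIComponent(endkey)
    );
  });
}

// Five types give five URLs, one GET each.
const urls = dayQueries(VIEW_URL, types, "20080403");
```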

View collation is one of my favorite things about CouchDB. I'm excited
about reduce, because from what I understand, you could use it to
lower this to 1 GET, if that's important to you.

enjoy,
Chris

On Fri, Apr 25, 2008 at 9:34 PM, Anthony Mills <[EMAIL PROTECTED] > wrote:
I read most of the documentation, wiki and blogs, but I still do not see how to accomplish a certain scenario. Hopefully I can describe it adequately.

Let's say I have 1,000,000 documents [all of the same "type"] with a date attribute. Let's say I want to pick a subset of those documents. How can I pick those documents of one type that fall on one day? Will I need to get all 1,000,000 documents? What if I want all documents of one type on one day that match another attribute?

I'm pretty sure this is what map/reduce will help with, but is there a way to
do this now? Can you use more documents to build date relations?

Also, can you pass more variables than just key to





