The view build is already batched, so in my opinion your strategy A can only ever be slower than, or at best the same speed as, B.
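For reference, filling the database first and then triggering the index build once at the end might look roughly like this. This is just a sketch assuming a Python client with requests; the server URL, database name, and the "numbers"/"by_ported_number" design document and view names are placeholders, not anything from your setup:

    import requests

    COUCH = "http://localhost:5984"   # placeholder server URL
    DB = "ported_numbers"             # placeholder database name

    def bulk_insert(docs, batch_size=1000):
        # POST each batch to _bulk_docs instead of saving docs one at a time.
        for i in range(0, len(docs), batch_size):
            batch = docs[i:i + batch_size]
            resp = requests.post("%s/%s/_bulk_docs" % (COUCH, DB),
                                 json={"docs": batch})
            resp.raise_for_status()

    def trigger_view_build():
        # A single view request after all inserts; limit=0 builds the whole
        # index without returning any rows. Design doc and view names here
        # are made up - substitute your own.
        resp = requests.get("%s/%s/_design/numbers/_view/by_ported_number"
                            % (COUCH, DB), params={"limit": 0})
        resp.raise_for_status()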
Try inserting the docs using _bulk_docs, it'll go much faster. I'd fill the database up and hit the view at the end for the fastest build time, but I'd still expect it to take a while to build the view the first time.

Do you have a reduce on the view? Are there other views in the same design document?

B.

On 13 March 2012 22:45, Daniel Gonzalez <[email protected]> wrote:
> Hi,
>
> I am creating a database with lots of documents (3 million).
> I have a view in the database:
>
> function(doc) {
>   if (doc.PORTED_NUMBER) emit(doc.PORTED_NUMBER, doc.RECEIVING_OPERATOR);
> }
>
> To speed up view creation, I am doing the following (Strategy A):
>
> 1. Define view
> 2. Insert 1000 documents
> 3. Access the view
> 4. Goto 2
>
> I repeat this process until all documents have been inserted.
>
> I have read that this is faster than my previous strategy (Strategy B,
> now obsolete):
>
> 1. Insert all documents
> 2. Define view
> 3. Access view
>
> My problem is that, in my current Strategy A, step 3 is taking longer and
> longer. Currently I have around 300,000 documents inserted and view
> access is taking around 120 seconds.
> The evolution of the delay in view access has been:
>
> 2012-03-13 23:01:40,405 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> 2012-03-13 23:03:29,589 - __main__ - INFO - - View ready, ellapsed 109
> 2012-03-13 23:03:32,945 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> 2012-03-13 23:05:31,699 - __main__ - INFO - - View ready, ellapsed 118
> 2012-03-13 23:05:35,106 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> 2012-03-13 23:07:28,392 - __main__ - INFO - - View ready, ellapsed 113
> 2012-03-13 23:07:31,663 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> 2012-03-13 23:09:26,929 - __main__ - INFO - - View ready, ellapsed 115
> 2012-03-13 23:09:30,572 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> 2012-03-13 23:11:27,490 - __main__ - INFO - - View ready, ellapsed 116
> 2012-03-13 23:11:30,784 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> 2012-03-13 23:13:21,575 - __main__ - INFO - - View ready, ellapsed 110
> 2012-03-13 23:13:24,937 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> 2012-03-13 23:15:23,519 - __main__ - INFO - - View ready, ellapsed 118
> 2012-03-13 23:15:26,836 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> 2012-03-13 23:17:23,036 - __main__ - INFO - - View ready, ellapsed 116
> 2012-03-13 23:17:26,310 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
>
> It started at around 1 second and has been increasing more or less
> monotonically.
> The import has been running for 7 hours now, and only 300,000 documents
> have been imported and indexed.
> If everything continues like this (I do not know what kind of mathematical
> function this is following, but to me it looks exponential), importing
> all 3 million documents is going to take forever.
>
> Is there a way to speed this up?
>
> Thanks!
> Daniel
