Hi,

I have no reduce on the view, and that is my only view. I *am* doing bulk inserts (1000 documents each), and after each bulk insert I access the view. (My assumption is that this will be faster than accessing the view once, after all 3 million documents have been inserted.)
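A minimal sketch of the loop described above, using only the Python standard library. It assumes the standard `_bulk_docs` endpoint; `db_url` and `view_path` (for example `_design/ported/_view/by_number`) are hypothetical names, not taken from the actual script. Passing `view_path=None` gives the "insert everything, hit the view at the end" variant, so both approaches can be timed with the same code.

```python
import json
import urllib.request
from itertools import islice

BATCH_SIZE = 1000  # batch size mentioned in the message


def batches(docs, size=BATCH_SIZE):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_payload(chunk):
    """Build the JSON request body for POST <db>/_bulk_docs."""
    return json.dumps({"docs": chunk}).encode("utf-8")


def post_bulk(db_url, chunk):
    """POST one batch to _bulk_docs and return the per-document results."""
    req = urllib.request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=bulk_payload(chunk),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def touch_view(db_url, view_path):
    """Request one row; without stale=ok this waits for the index to catch up."""
    url = db_url.rstrip("/") + "/" + view_path + "?limit=1"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def load(db_url, docs, view_path=None):
    """Insert docs in batches; if view_path is given, hit the view after each batch."""
    for chunk in batches(docs):
        post_bulk(db_url, chunk)
        if view_path is not None:
            touch_view(db_url, view_path)
```

Only `batches` and `bulk_payload` are pure functions; the other three need a running CouchDB server.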
I know the numbers will vary a lot, but: what is the expected indexing time for the view that I posted, for 3 million documents? How can I monitor view creation (how many documents have already been indexed)?

I got the idea that "bulk insert + view access + repeat" was faster than "full insert + view access" here:
http://iamseanmurphy.com/2008/09/08/couchdb-view-generation/

Thanks,
Daniel

On Tue, Mar 13, 2012 at 11:58 PM, Robert Newson <[email protected]> wrote:
> The view build is already batched. In my opinion your strategy A can
> only ever be slower or the same speed as B.
>
> Try inserting the docs using _bulk_docs, it'll go much faster. I'd
> fill the database up and hit the view at the end for the fastest build
> time, but I'd still expect it to take a while to build the view the
> first time.
>
> Do you have a reduce on the view? Are there other views in the same
> design document?
>
> B.
>
> On 13 March 2012 22:45, Daniel Gonzalez <[email protected]> wrote:
> > Hi,
> >
> > I am creating a database with lots of documents (3 million).
> > I have a view in the database:
> >
> > function(doc) {
> >   if (doc.PORTED_NUMBER) emit(doc.PORTED_NUMBER, doc.RECEIVING_OPERATOR);
> > }
> >
> > To speed up view creation, I am doing the following (Strategy A):
> >
> > 1. Define view
> > 2. Insert 1000 documents
> > 3. Access the view
> > 4. Goto 2
> >
> > And I repeat this process until all documents have been inserted.
> >
> > I have read that this is faster than my previous strategy (Strategy B,
> > obsolete):
> >
> > 1. Insert all documents
> > 2. Define view
> > 3. Access view
> >
> > My problem is that, in my current Strategy A, step 3 is taking longer and
> > longer. Currently I have around 300 thousand documents inserted and view
> > access is taking around 120s.
> >
> > The evolution of the delay in view access has been:
> >
> > 2012-03-13 23:01:40,405 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> > 2012-03-13 23:03:29,589 - __main__ - INFO - - View ready, ellapsed 109
> > 2012-03-13 23:03:32,945 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> > 2012-03-13 23:05:31,699 - __main__ - INFO - - View ready, ellapsed 118
> > 2012-03-13 23:05:35,106 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> > 2012-03-13 23:07:28,392 - __main__ - INFO - - View ready, ellapsed 113
> > 2012-03-13 23:07:31,663 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> > 2012-03-13 23:09:26,929 - __main__ - INFO - - View ready, ellapsed 115
> > 2012-03-13 23:09:30,572 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> > 2012-03-13 23:11:27,490 - __main__ - INFO - - View ready, ellapsed 116
> > 2012-03-13 23:11:30,784 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> > 2012-03-13 23:13:21,575 - __main__ - INFO - - View ready, ellapsed 110
> > 2012-03-13 23:13:24,937 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> > 2012-03-13 23:15:23,519 - __main__ - INFO - - View ready, ellapsed 118
> > 2012-03-13 23:15:26,836 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> > 2012-03-13 23:17:23,036 - __main__ - INFO - - View ready, ellapsed 116
> > 2012-03-13 23:17:26,310 - __main__ - INFO - - BulkSend >> requested= 1000 ok= 1000 errors= 0
> >
> > It started at around 1s, and it is increasing more or less monotonically.
> > It has already been running for 7 hours, and only 300000 documents have
> > been imported and indexed.
> >
> > If everything continues like this (I do not know what kind of mathematical
> > function this is following, but to me it looks exponential), importing
> > the 3 million documents is going to take forever.
> >
> > Is there a way to speed this up?
> >
> > Thanks!
> > Daniel
>
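A rough back-of-the-envelope projection from the figures in the thread, as a sketch only. It assumes (this is a guess, not something the log proves) that the view access time grows linearly with the number of batches already inserted, from about 1 s at the start to about 115 s at around 300 batches. Under that assumption the total time is quadratic in the number of batches, not exponential. A second figure assumes the delay simply stays flat at about 115 s for the remaining batches.

```python
BATCH_SIZE = 1000          # documents per bulk insert (from the thread)
TOTAL_DOCS = 3_000_000     # target size (from the thread)
OBSERVED_BATCHES = 300     # about 300,000 documents inserted so far
OBSERVED_SECONDS = 115.0   # typical "View ready" time in the posted log


def projected_hours(n_batches, slope):
    """Total view-access time if batch k costs slope * k seconds (ASSUMED linear growth)."""
    total_seconds = slope * n_batches * (n_batches + 1) / 2
    return total_seconds / 3600.0


slope = OBSERVED_SECONDS / OBSERVED_BATCHES   # seconds added per batch (assumption)
total_batches = TOTAL_DOCS // BATCH_SIZE      # 3000 batches

so_far_hours = projected_hours(OBSERVED_BATCHES, slope)
full_hours = projected_hours(total_batches, slope)

# Alternative assumption: the delay stays flat for the remaining batches.
flat_remaining_hours = (total_batches - OBSERVED_BATCHES) * OBSERVED_SECONDS / 3600.0

print("model, first 300 batches: %.1f h" % so_far_hours)
print("model, all 3000 batches:  %.1f h" % full_hours)
print("flat 115 s, remaining:    %.1f h" % flat_remaining_hours)
```

The linear model gives about 4.8 hours for the first 300 batches (the thread reports 7 hours, which also includes the inserts themselves) and roughly 480 hours for all 3000 batches. The flat assumption gives about 86 hours for the remaining 2700 batches.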

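On the question of monitoring view creation: CouchDB lists running tasks, including view indexing, at `/_active_tasks`. The sketch below polls that endpoint and extracts "changes done / total" per indexer. The field names are assumptions that depend on the CouchDB version: newer releases report `type: "indexer"` with `changes_done` and `total_changes`, while older ones report `type: "View Group Indexer"` with a `status` string such as "Processed N of M changes". Check the raw output of your server before relying on them. `server_url` is a hypothetical name, and the endpoint may require admin credentials.

```python
import json
import re
import urllib.request

# Older releases put the counters in a free-text status string (assumed format).
STATUS_RE = re.compile(r"Processed (\d+) of (\d+) changes")


def indexer_progress(tasks):
    """Return (label, changes_done, total_changes) for each view-indexing task."""
    progress = []
    for task in tasks:
        kind = task.get("type", "")
        if kind == "indexer" and "changes_done" in task:
            # Newer-style task with numeric fields (assumed field names).
            label = "%s %s" % (task.get("database", "?"),
                               task.get("design_document", "?"))
            progress.append((label, task["changes_done"], task.get("total_changes")))
        elif kind == "View Group Indexer":
            # Older-style task: parse the counters out of the status text.
            match = STATUS_RE.search(task.get("status", ""))
            if match:
                progress.append((task.get("task", "?"),
                                 int(match.group(1)), int(match.group(2))))
    return progress


def fetch_active_tasks(server_url):
    """GET <server>/_active_tasks and return the decoded JSON list."""
    with urllib.request.urlopen(server_url.rstrip("/") + "/_active_tasks") as resp:
        return json.load(resp)
```

Calling `indexer_progress(fetch_active_tasks("http://localhost:5984"))` every few seconds while a view request is blocked should show the counters advancing, provided the server is on that hypothetical address.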