I use "insert or replace" heavily. In debug mode, I set it to use temp
file on disk so, I can watch the disk IO, in release mode I set it to
temp file in memory.

The danger is that if you ever do anything that requires a bunch of
temp file, you can easily run out of RAM. Adding and deleting indexes
for example while temp store is set to memory will run you out of
address space on a larger DB and a 32 bit system (windows).

In  my  testing, "insert or replace" uses temp store most of the time.



Wednesday, January 23, 2013, 9:16:42 AM, you wrote:

DK> On 01/23/2013 04:20 AM, Heiles, Katrina wrote:
>> I am in the process of updating from 3.6.4 to the latest and greatest 
>> version (finally :-) ).
>> While running performance unit tests, i found a major regression (10K/sec vs 
>> 30k/sec)
>> on a common use case for us.  The regression occurs when using "insert or 
>> replace".
>>
>> I narrowed this down as follows:
>> - the regression was introduced in version 3.6.19, and has been there ever 
>> since.
>> - If i comment out line 74643 in sqlite3.c of version 3.6.19, performance is 
>> good again.
>>
>> QUESTION:  I realize that commenting this line out is not the correct 
>> solution.
>>             Is there any way to improve performance of "insert or replace"?

DK> Out of interest, is performance improved any with 3.7.15.2 if you
DK> execute "PRAGMA temp_store = memory" before calling _do_batch_insert()?

DK> Thanks,
DK> Dan.




>>
>>
>> I am including 2 other attachments:
>> ------------------------------------
>> 1. test_batch_insert.c
>>    - this is the unit test i created to reproduce the issue.  It output
>> 2. notes.txt
>>    - this contains the performance output from running test_batch_insert.c
>>      on 3 different versions of sqlite
>>        - 3.6.18  (fast)
>>        - 3.6.19  (slow)
>>        - 3.6.19 with line 74643 commented out  (fast)
>>
>>
>> Below are detailed (but cryptic) notes on what causes the issue.  Please let 
>> me know if you need more info.
>>
>> system:
>> Linux 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:59:20 EDT 2006 x86_64 x86_64 
>> x86_64 GNU/Linux
>>
>> NOTE: all line number references are based on 3_6_19 version of sqlite3.c,
>> where the regression was first introduced.
>>
>> ROOT CAUSE:
>> If I remove one line of code from sqlite3.c (line 74643) that was added in 
>> this release, performance
>> regression is resolved.
>> -----------------------------------------------------------------------------------------------
>>    74640       default: {
>>    74641         Trigger *pTrigger = 0;
>>    74642         assert( onError==OE_Replace );
>>    74643         sqlite3MultiWrite(pParse);<----------- THIS IS THE GUILTY 
>> LINE!!!!
>>    74644         if( pParse->db->flags&SQLITE_RecTriggers ){
>>    74645           pTrigger = sqlite3TriggersExist(pParse, pTab, TK_DELETE, 
>> 0, 0);
>>    74646         }
>>    74647         sqlite3GenerateRowDelete(
>>    74648             pParse, pTab, baseCur, regR, 0, pTrigger, OE_Replace
>>    74649         );
>>    74650         seenReplace = 1;
>>    74651         break;
>>    74652       }
>> -----------------------------------------------------------------------------------------------
>>
>>
>> DETAILS OF WHY THIS LINE CAUSES THE REGRESSION:
>> The effect of including line 74643 when running a batch insert:
>>
>> HIGH LEVEL:
>> in pager_write function, we<sometimes>  end up going into this if statement, 
>> which creates a
>> subjournal, causing creation of etilqs_xxx file.
>> NOTE: using the attached test_batch_insert.c file with max_entries set to 
>> 40000, this
>>        results in creation of 15 etilqs_xxx  and 262122 MORE writes!!
>> -----------------------------------------------------------------------------------------------
>>        35536     /* If the statement journal is open and the page is not in 
>> it,
>>        35537     ** then write the current page to the statement journal.  
>> Note that
>>        35538     ** the statement journal format differs from the standard 
>> journal format
>>        35539     ** in that it omits the checksums and the header.
>>        35540     */
>>        35541     if( subjRequiresPage(pPg) ){
>>        35542       rc = subjournalPage(pPg);<---- we create a subjournal
>>        35543     }
>>        35544   }
>> -----------------------------------------------------------------------------------------------
>>
>>
>> LOWER LEVEL DETAILS OF WHY subJRequiresPage IS CALLED WHEN LINE 74643 IS IN 
>> THE CODE
>> when running sample test_batch_insert.c file (included in this bug)
>> The reason that subjRequiresPage() returns true is that when
>> sqlite3GenerateConstraintChecks is called from sqlite3Insert,
>> this calls the guilty line,(74643)sqlite3MultiWrite(pParse);
>>
>>      - This sets pParse->isMultiWrite to 1
>>      - then sqlite3FinishCoding calls  sqlite3VdbeMakeReady with 
>> pParse->isMultiWrite =1
>>        causing 3rd param of call to be TRUE.
>>      - This causes Vdbe.usesStmtJournal to be set to TRUE in 
>> sqliteVdbeMakeReady
>>      - Then sqlite3BtreeBeginStmt is called from sqlite3VdbeExec (case 
>> OP_Transaction:)
>>           here, p->iStatement is set to 1 because p->usesStmtJournal is 1
>>                54698     if( pOp->p2&&  p->usesStmtJournal<--- we go INTO 
>> this if statement
>>                54699&&  (db->autoCommit==0 || db->activeVdbeCnt>1)
>>                54700     ){
>>                54701       assert( sqlite3BtreeIsInTrans(u.as.pBt) );
>>                54702       if( p->iStatement==0 ){
>>                54703         assert( db->nStatement>=0&&  db->nSavepoint>=0 
>> );
>>                54704         db->nStatement++;<---- this sets to 1, causing 
>> next line to set iStatement to 1
>>                54705         p->iStatement = db->nSavepoint + db->nStatement;
>>                54706       }
>>                54707       rc = sqlite3BtreeBeginStmt(u.as.pBt, 
>> p->iStatement);
>>
>>      - sqlite3BtreeBeginStmt calls sqlite3PagerOpenSavepoint using 
>> iStatement as 2nd parameter,
>>         therefore nSavepoint is set to 1
>>
>>      - eventually we call sqlite3BtreeInsert which calls insertCell which 
>> calls sqlite3PagerWrite which
>>        calls pager_write a second time for the SAME pPg->pgno (see below for 
>> example page 6)
>>              - because we've gone through once for the same page, we do not 
>> enter the if clause
>>                on line 35464,
>>                35464     /* The transaction journal now exists and we have a 
>> RESERVED or an
>>                35465     ** EXCLUSIVE lock on the main database file.  Write 
>> the current page to
>>                35466     ** the transaction journal if it is not there 
>> already.
>>                35467     */
>>                35468     if( !pageInJournal(pPg)&&  isOpen(pPager->jfd) ){
>>
>>        - therefore we DO NOT add this page to the savepoint bitVec on line 
>> 35517
>>                35517         rc = sqlite3BitvecSet(pPager->pInJournal, 
>> pPg->pgno);
>>
>>        - therefore we DO go into the if(subjRequiresPage(pPg)), because
>>          nSavepoint is 1 but the bit is NOT set
>>                35536     /* If the statement journal is open and the page is 
>> not in it,
>>                35537     ** then write the current page to the statement 
>> journal.  Note that
>>                35538     ** the statement journal format differs from the 
>> standard journal format
>>                35539     ** in that it omits the checksums and the header.
>>                35540     */
>>                35541     if( subjRequiresPage(pPg) ){
>>                35542       rc = subjournalPage(pPg);
>>                35543     }
>>           from gdb...
>>              - (gdb) p *pPg
>>                    $17= {pPage = 0x2b84618, pData = 0x2b83618, pExtra = 
>> 0x2b84698, pDirty = 0x0, pPager = 0x2b666c8,
>>                      pgno = 6, flags = 6, nRef = 1, pCache = 0x2b667d0, 
>> pDirtyNext = 0x2b83540, pDirtyPrev = 0x0}
>>                (gdb) p  pageInJournal(pPg)
>>                    $18 = 1
>>                (gdb)  p 
>> sqlite3BitvecTest(pPager->aSavepoint[0]->pInSavepoint,6)
>>                    $19 = 0
>>
>>        - therefore we end up CALLING subjournalPage(pPg) from line 35542,
>>          causing  temp file to be created and lots of small writes to the 
>> file
>>
>>
>>
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

DK> _______________________________________________
DK> sqlite-users mailing list
DK> sqlite-users@sqlite.org
DK> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




-- 
Best regards,
 Teg                            mailto:t...@djii.com

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to