Version 3.5.3 made a change to the way DISTINCT is processed.
Probably that change is making your particular case much slower. The
change can be seen at:
http://www.sqlite.org/cvstrac/chngview?cn=4538
This change was in response to grumbling on the mailing list
http://www.mail-archive.com/[email protected]/msg28894.html
It would appear that I need to spend some time improving this
optimization - enabling it only in cases where it seems likely to
improve performance and disabling it in cases like yours were it makes
things much worse. We'll try to have a look at that for version 3.6.9.
On Jan 5, 2009, at 7:13 PM, Jason Freund wrote:
> Dan,
>
> Thanks, I sent a separate email containing the source and data for
> my test app since it's too big (5M) to post to the group. But I'm
> also posting just the code for the test app here, in case anyone
> wants to take a look. On my target hardware, the results I got for
> the most egregious regression (doing prepare and step on a
> selection) were:
>
> 3.4.0 3.6.5
> 100 records 4ms 20ms
> 500 records 1.4ms 89ms
> 1000 records 1.5ms 173ms
>
> If anyone else is interested in seeing the data to try to reproduce
> and track the problem, please let me know, and I'll send you the
> full zip file.
>
> Thanks,
> Jason
>
>
> ===
> /
> ******************************************************************************
> * This app prepares a statement and steps 50 times in a database of
> 100,
> * 500 or 1000 records, using a SQL statement that seems to demonstrate
> * a performance regression between 3.4.0 and 3.6.5
> *
> * The raw results we got when running this test on some cellphone
> reference
> * hardware running an ARM5/~600Mhz chip and NAND storage are:
> * 3.4.0 3.6.5
> * N=100, Prepare, StepX50 4ms 20ms
> * N=500, Prepare, StepX50 1.4ms 89
> * N=1000 Prepare, StepX50 1.5ms 173ms
> *
> * While this test demonstrates the worst degradation: constant
> performance
> * for varying data sizes with 3.4.0 and linear performance for
> 3.6.5, there
> * were several other test cases that were linear on both 340 and
> 365, but
> * were consistently 30% slower for 365. These other cases included a
> * different query, and a couple of INSERT statements, but we didn't
> * include those results in this test.
> *
> * Any advice or insight on this case would be appreciated:
> * [email protected]
> *****************************************************************************/
>
> #include <string.h>
> #include <stdio.h>
> #include <sqlite3.h>
> #include <wchar.h>
> #include <glib.h>
> #include <sys/time.h>
>
> /* Return delta between oldTime - currentTime in millisecons */
> float TimeInterval(struct timeval *oldTvP)
> {
> struct timeval tv;
> gettimeofday(&tv, NULL);
>
> if(oldTvP == NULL)
> return(0);
>
> tv.tv_sec -= oldTvP->tv_sec;
> if (tv.tv_usec < oldTvP->tv_usec) {
> tv.tv_usec += 1000000 - oldTvP->tv_usec;
> tv.tv_sec--;
> } else {
> tv.tv_usec -= oldTvP->tv_usec;
> }
>
> return (((float)tv.tv_sec*1000) + tv.tv_usec/1000.0f);
> }
>
> static int my_utf8_collator(void* user_data, int Length1,const void*
> Start1,
> int Length2, const void* Start2)
> {
> int Result = 0;
>
> gchar* Char1 = (gchar*) Start1;
> gchar* Stop1 = Char1 + Length1;
> wchar_t Wide1[2] = {0, 0};
>
> gchar* Char2 = (gchar*) Start2;
> gchar* Stop2 = Char2 + Length2;
> wchar_t Wide2[2] = {0, 0};
>
> while (0 == Result && Char1 < Stop1 && Char2 < Stop2)
> {
> Wide1[0] = (wchar_t) g_utf8_get_char(Char1);
> Wide2[0] = (wchar_t) g_utf8_get_char(Char2);
>
> Result = wcscoll(Wide1, Wide2);
>
> if (0 == Result)
> {
> Char1 = g_utf8_next_char(Char1);
> Char2 = g_utf8_next_char(Char2);
> }
> }
>
> if (Result == 0) {
> Result = Length1 - Length2;
> }
>
> return Result;
> }
>
> void SelectionTest(char *dbName, int category)
> {
> sqlite3 *dbH = NULL;
> struct timeval oldTv;
> float totalTime=0.0;
> const unsigned char *junkStringP = NULL;
> char *dummy = NULL, sqlStrP[512];
> int sqlErr, loopCount=0, junk;
> sqlite3_stmt *stmtP = NULL;
> char *selectSql = "SELECT DISTINCT PERSON.PersonLUID, FirstName,
> LastName, Company, EnglishName, Note NOTNULL, SIMPhoneBook,
> DisplayNameLast FROM PERSON, _CATEGORY_MEMBERSHIP WHERE
> _CATEGORY_MEMBERSHIP.EntityLUID = PERSON.PersonLUID AND
> _CATEGORY_MEMBERSHIP.CategoryLUID = %d ORDER BY LastName, FirstName,
> Company, SIMPhoneBook, PERSON.PersonLUID COLLATE my_utf8_collator
> LIMIT 100 OFFSET 0;";
> char *selectSql2 = "SELECT DISTINCT PERSON.PersonLUID, FirstName,
> LastName, Company, EnglishName, Note NOTNULL, SIMPhoneBook,
> DisplayNameLast FROM PERSON ORDER BY LastName, FirstName, Company,
> SIMPhoneBook, PERSON.PersonLUID COLLATE my_utf8_collator LIMIT 100
> OFFSET 0;";
>
> /* Open SQLite database connection */
> sqlErr = sqlite3_open(dbName, &dbH);
> if (sqlErr) {
> printf("sqlite3_open(%s) = 0x%x\n", dbName, sqlErr);
> }
>
> /* Install user UTF-8 collate function */
> sqlErr = sqlite3_create_collation(
> dbH, /* (sqlite3 *) */
> "my_utf8_collator", /* (const char *)zName */
> SQLITE_UTF8, /* int enc */
> NULL, /* (void *)pCtx */
> &my_utf8_collator /* int(xCompare) */
> );
> if (sqlErr) {
> printf("sqlite3_create_collation = 0x%x\n", sqlErr);
> }
>
> /* Build first SQL statement, with JOIN */
> sprintf(sqlStrP, selectSql, category);
>
> /* Start timer, prepare statement and step up to 50 times */
> gettimeofday(&oldTv, NULL);
> sqlErr = sqlite3_prepare(dbH, sqlStrP, strlen(sqlStrP),
> &stmtP, (const char **) &dummy);
>
> while (((sqlErr = sqlite3_step(stmtP)) == SQLITE_ROW) && loopCount
> < 50) {
> /* Include a few column fetches from each row in the timing */
> junk = sqlite3_column_int(stmtP, 0);
> junkStringP = sqlite3_column_text(stmtP, 7);
> junk = sqlite3_column_int(stmtP, 6);
> junk = sqlite3_column_int(stmtP, 5);
>
> loopCount++;
> }
>
> /* Finalize statement, stop timer, and report results */
> sqlErr = sqlite3_finalize(stmtP);
> totalTime = TimeInterval (&oldTv);
> printf("A) Create select statement with Category and iterate %d "
> "times = %3.3f ms\n", loopCount, totalTime);
>
> /* Start timer, prepare statement and step up to 50 times */
> gettimeofday(&oldTv, NULL);
> sqlErr = sqlite3_prepare(dbH, selectSql2, strlen(selectSql2),
> &stmtP, (const char **) &dummy);
>
> while (((sqlErr = sqlite3_step(stmtP)) == SQLITE_ROW) && loopCount
> < 50) {
> /* Include a few column fetches from each row in the timing */
> junk = sqlite3_column_int(stmtP, 0);
> junkStringP = sqlite3_column_text(stmtP, 7);
> junk = sqlite3_column_int(stmtP, 6);
> junk = sqlite3_column_int(stmtP, 5);
>
> loopCount++;
> }
>
> /* Finalize statement, stop timer, and report results */
> sqlErr = sqlite3_finalize(stmtP);
> totalTime = TimeInterval (&oldTv);
> printf("B) Create select statement (no Category) and iterate %d "
> "times = %3.3f ms\n\n", loopCount, totalTime);
>
> /* Close DB */
> sqlErr = sqlite3_close(dbH);
> }
>
> /* Run 3 queries and report timings */
> int main(int argc, char *argv[])
> {
> const char *versionP;
>
> versionP = sqlite3_libversion();
> printf("SQLite %s\n", versionP);
>
> SelectionTest("contacts_100.db", -1073743423);
> SelectionTest("contacts_500.db", -1073743576);
> SelectionTest("contacts_1000.db", -1073741927);
>
> return 0;
> }
>
> /
> *Schema
> **********************************************************************
> BEGIN TRANSACTION;
>
> CREATE TABLE PERSON(
> PersonLUID integer not null primary key,
> FirstName text default null collate nocase,
> MiddleName text default null,
> LastName text default null collate nocase,
> Title text default null,
> Company text default null collate nocase,
> Prefix text default null,
> Suffix text default null,
> NickName text default null,
> EnglishName text default null,
> YomiFirstName text default null,
> YomiLastName text default null,
> YomiCompany text default null,
> Ringtone text default null,
> VideoRingtone text default null,
> Note text default null,
>
> FriendlyName text default null,
> DisplayNameLast text default null,
> DisplayNameCompany text default null,
> DisplayNameFirst text default null,
> TwoCharLookup text default null,
>
> Birthday integer default null,
> BirthdayAlarm integer default null,
> Anniversary integer default null,
> AnniversaryAlarm integer default null,
>
> UseCount integer default 0,
> DontBeamNote boolean default false,
> IsBusinessCard boolean default false,
>
> SIMPhoneBook integer default 0,
> SIMEntryID integer default 0
> );
>
> CREATE INDEX FirstNameIndex ON PERSON(FirstName);
> CREATE INDEX LastNameIndex ON PERSON(LastName);
> CREATE INDEX CompanyIndex ON PERSON(Company);
> CREATE INDEX UseCountIndex ON PERSON(UseCount);
> CREATE INDEX TwoCharLookupIndex ON PERSON(TwoCharLookup);
> CREATE INDEX LastFirstCompanyIndex ON
> PERSON(LastName,FirstName,Company,SIMPhoneBook);
> CREATE INDEX FirstLastCompanyIndex ON
> PERSON(FirstName,LastName,Company,SIMPhoneBook);
> CREATE INDEX CompanyLastFirstIndex ON
> PERSON(Company,LastName,FirstName,SIMPhoneBook);
>
> CREATE TABLE ADDRESS(
> AddressLUID integer primary key default 1,
> PersonLUID integer not null,
> Identifier text default null,
> Address text default null,
> City text default null,
> State text default null,
> PostalCode text default null,
> Country text default null,
> VCardSrc text default null
> );
>
> CREATE INDEX AddressPersonLUIDIndex ON ADDRESS(PersonLUID);
>
> CREATE TABLE EXTRAFIELDS(
> ExtraFieldLUID integer primary key default 1,
> PersonLUID integer not null,
> Identifier text default null,
> Value text default null,
> VCardSrc text default null
> );
>
> CREATE INDEX ExtraFieldPersonLUIDIndex ON EXTRAFIELDS(PersonLUID);
>
> CREATE TABLE CONTACTS(
> ContactLUID integer primary key default 1,
> PersonLUID integer not null,
> Identifier text default null,
> Value text default null,
> ContactType text default null,
> UseCount integer default 0,
> DefaultContact boolean default false,
> SpeedDial integer default 0,
> VCardSrc text default null
> );
>
> CREATE INDEX ContactPersonLUIDIndex ON CONTACTS(PersonLUID);
>
> CREATE TABLE THUMBNAIL(
> ThumbnailLUID integer primary key default 1,
> PersonLUID integer not null,
> SmallThumbnail blob not null,
> LargeThumbnail blob not null
> );
>
> CREATE INDEX ThumbnailPersonLUIDIndex ON THUMBNAIL(PersonLUID);
>
> CREATE TABLE CUSTOMLABELS(
> CustomLabelLUID integer primary key default 1,
> Label text not null,
> Identifier text not null
> );
>
> CREATE TABLE VERSION(
> Version integer primary key default 1
> );
>
> INSERT INTO VERSION VALUES(3);
>
> CREATE TABLE LOADEDDEFAULTS(
> HasDefaults integer primary key default 0
> );
>
> CREATE TABLE CATEGORY_RINGTONES(
> CategoryLUID integer not null,
> Ringtone text default null,
> RingtoneType integer default 0
> );
>
> CREATE TABLE RECORD_COUNT_CACHE(
> CategoryLUID integer not null primary key,
> RecordCount integer default null
> );
>
> CREATE TABLE DISTRIBUTION(
> DistributionLUID integer primary key default 1,
> Value text not null,
> DistributionType text not null
> );
>
> CREATE TABLE DISTRIBUTION_MEMBERSHIP(
> MembershipLUID integer primary key default 1,
> DistributionLUID integer not null,
> PersonLUID integer not null,
> Identifier text not null
> );
>
> CREATE INDEX DistMembershipIndex ON
> DISTRIBUTION_MEMBERSHIP(DistributionLUID,PersonLUID,Identifier);
>
> COMMIT;
>
> BEGIN TRANSACTION;
>
> CREATE TABLE _CATEGORY(
> CategoryLUID integer not null primary key default 1,
> Name text not null,
> ReadOnly int not null
> );
>
> CREATE TABLE _CATEGORY_MEMBERSHIP(
> CategoryMemberLUID integer not null primary key default 1,
> CategoryLUID integer not null,
> EntityLUID integer not null
> );
>
> COMMIT;";
> *****************************************************************************/
>
>
> On Dec 25, 2008, at 5:01 AM, Jason Freund wrote:
>
>> Hi,
>>
>>
>>
>> We recently upgraded from 3.4.0 to 3.6.5, and noticed that our
>> little,
>> internal sqlite benchmark test program performed much worse on most
>> operations with the new version. We expected to see an improvement
>> because 3.6.1 advertises a performance enhancement with the lookaside
>> buffer.
>>
>>
>>
>> Our small performance test app showed regressions averaging 25%
>> slowdown
>> for all read and write operations, but one test case was very
>> striking:
>> going from 2ms in 3.4.0 to 170ms in 3.6.5. This test prepared a
>> "SELECT
>> DISTINCT from one table with an ORDER BY, COLLATE, and LIMIT" and
>> then
>> stepped 50 times through that statement.
>>
>>
>>
>> I was wondering if anyone else encountered a regression, and
>> whether I
>> can interest someone from the dev community in looking into it, or
>> whether I should file a bug? Also, I may be able to do more builds
>> to
>> help isolate when the regressions were introduced, but I'm not sure
>> if
>> there is an easy way to download code for earlier releases? I
>> haven't
>> used cvs before - is there an easy way to checkout the code for some
>> kind of "label" that represents an official release, or is there a
>> depot
>> somewhere that I can just download earlier distros?
>
> If you post the performance test program I'll take a look at it.
>
> Dan.
>
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
D. Richard Hipp
[email protected]
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users