OK...here's a new thumb...data is now randomly distributed at 50%.   I'm 
running on Windows XP with version 3.7.5
And the between logic works about 36% faster.  Is your "rule of thumb" based on 
any benchmark or just a guess?

#include <stdio.h>
#include <stdlib.h>
int main()
{
  int i;
  char sql[4096];
  printf("CREATE TABLE x ('col1','col2','col3');\n");
  printf("BEGIN;");
  for(i=0;i<1000000;++i) {
    double r=rand()/(double)RAND_MAX;
    if (r < .5) {
      sprintf(sql,"INSERT INTO x values(null,'col2_%d','col3_%d');",i,i,i);
    }
    else {
      sprintf(sql,"INSERT INTO x values('col_%d','col2_%d','col3_%d');",i,i,i);
    }
    printf("%s\n",sql);
  }
  printf("COMMIT;");
  printf("CREATE INDEX col1index on x('col1');\n");
}

D:\SQLite\index1>sqlite3 test.db
SQLite version 3.7.5
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .read data.sql
sqlite> .timer on
sqlite> select count(*) from x where col1 between (select min(col1) from x) and 
(select max(col1) from x);
500080
CPU Time: user 0.218750 sys 0.046875
sqlite> select count(*) from x where col1 is not null;
500080
CPU Time: user 0.343750 sys 0.062500
sqlite> select count(*) from x where col1 is null;
499920
CPU Time: user 0.109375 sys 0.015625

Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems



From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Igor Tandetnik [itandet...@mvps.org]
Sent: Monday, October 17, 2011 10:06 AM
To: sqlite-users@sqlite.org
Subject: EXT :Re: [sqlite] "is not null" and index


On 10/17/2011 9:30 AM, Black, Michael (IS) wrote:
> According to this benchmark the break-even point is at 40% nulls.  I asssume 
> you have a different test?

I did mention "rule of thumb". Specific cases may vary. I must admit I'm
too lazy to build tests for someone else's problem.

The fact that all NULL values are clustered together in rows with
sequential rowids might have skewed the results in your test. Better
locality of reference, fewer pages to read from disk, improved cache
utilization.
--
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to