Did you do multiple runs and average? And how are you measuring time? Did you
do buffered reads?
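(The buffering question matters because stdio can absorb small requests into large syscalls. A minimal sketch of a buffered byte-counter, separate from the benchmark below -- the function name and the 1 MB buffer size are illustrative choices, not anything from the original test:)

```c
#include <stdio.h>
#include <stdlib.h>

/* Count bytes in a file through stdio.  iobuf is the per-fread()
 * request size; setvbuf() gives stdio a 1 MB internal buffer, so
 * even tiny fread()s reach the kernel in large chunks.  Pass _IONBF
 * instead of _IOFBF to see unbuffered behavior. */
long count_bytes(const char *path, size_t iobuf)
{
    FILE *fp = fopen(path, "r");
    if (!fp)
        return -1;
    setvbuf(fp, NULL, _IOFBF, 1 << 20);   /* 1 MB stdio buffer */
    char *buf = malloc(iobuf);
    long total = 0;
    size_t n;
    while ((n = fread(buf, 1, iobuf, fp)) > 0)
        total += (long)n;
    free(buf);
    fclose(fp);
    return total;
}
```

With a large stdio buffer, timing small iobuf values mostly measures fread() call overhead rather than the disk.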
And your MB/sec figure isn't accurate for the count() test. I think it was already
said that count() doesn't read the entire record.
The SQLite numbers are due to SQLite caching. If you make page size 4096 you
are 4 times less likely to need another disk read (SSD or otherwise). That's
different from what you're doing. You're one layer down from that.
Your readings make perfect sense if I look at them as sampling error....which is
what I would expect.
As you increase your read page size you get a curve that saturates at 50MB/sec.
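(That saturating curve falls out of a simple cost model: each read pays a fixed per-call overhead plus transfer time at the device's raw bandwidth. The 20 microsecond overhead and 50 MB/sec bandwidth below are assumed numbers picked to resemble the figures in this thread, not measurements:)

```c
/* Toy model of read throughput vs. request size.  Effective MB/sec
 * saturates at the raw bandwidth as the request size grows, because
 * the fixed per-call overhead is amortized over more bytes. */
double model_mbps(double bufsize_bytes)
{
    const double overhead_sec = 20e-6;  /* assumed per-call cost  */
    const double raw_mbps = 50.0;       /* assumed raw bandwidth  */
    double mb = bufsize_bytes / 1e6;
    return mb / (overhead_sec + mb / raw_mbps);
}
```

With these assumptions the model gives roughly 5.7 MB/sec at 128-byte reads, 25 MB/sec at 1024, and 49 MB/sec at 65536 -- the same shape as the numbers reported below.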
Also...you either have a really slow machine or your timing is wrong.
I can do over 70MB/sec just reading from disk (2.6Ghz Xeon, 7200 SAS drives
mirrored)
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

/* tic(0) starts the timer; tic(1) returns seconds elapsed since tic(0). */
double tic(int flag)
{
    static double t1;
    double t2;
    struct timeval tv;
    static long base_seconds;
    gettimeofday(&tv, NULL);
    /* To give us the resolution we need for high speed we subtract our
       starting seconds -- otherwise 15 digits of accuracy aren't enough. */
    if (base_seconds == 0) base_seconds = tv.tv_sec;
    if (flag == 0) {
        t1 = (tv.tv_sec - base_seconds) + tv.tv_usec / (double)1000000;
    }
    t2 = (tv.tv_sec - base_seconds) + tv.tv_usec / (double)1000000;
    return t2 - t1;
}

void test1(int bufsize)
{
    char *buf = malloc(bufsize);
    FILE *fp = fopen("test2.dat", "r");
    int n, nn = 0;
    if (!fp) { perror("test2.dat"); exit(1); }
    while ((n = fread(buf, 1, bufsize, fp)) > 0) {
        nn += n;
    }
    fclose(fp);
    free(buf);
    printf("%d\n", nn);
}

int main(int argc, char *argv[])
{
    tic(0);
    test1(atoi(argv[1]));
    /* 512000000 bytes = the 500000 x 1024-byte blocks written by dd below;
       divide by 1e6 to report MB/sec. */
    printf("%f MB/sec\n", 512000000 / tic(1) / 1e6);
    return 0;
}
dd if=/dev/zero of=test2.dat bs=1024 count=500000
sync; echo 3 > /proc/sys/vm/drop_caches
io 4096
512000000
73.071591 MB/sec
Now that it's cached
io 4096
512000000
1884.153971 MB/sec
I would hope your SSD can do better than my disk on raw throughput.
Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate
________________________________________
From: [email protected] [[email protected]] on
behalf of Max Vlasov [[email protected]]
Sent: Monday, February 28, 2011 8:52 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] EXT :Re: COUNT() extremely slow first time!
Michael,
it's an interesting hypothesis. Maybe you can describe the following
results. Recently I made a test program that just reads a file sequentially
with different block sizes and results are still confusing. This was the
same SSD drive and the functions were basically from api:
CreateFile/ReadFile
128, 256: 5 MB/sec
512: 20 MB/sec
1024: 25 MB/sec
2048: 41 MB/sec
4096 - 65536: ~50 MB/sec
Strangely, I didn't see such a big difference between the slowest and fastest
scenarios (if we exclude the sizes below 1024); as you see, the difference is
only 2x. I have only one explanation looking at the numbers. Although it's
well known that sqlite reads only full pages, if it sometimes does partial
reads, this 5 MB/sec rate for <256-byte reads could drag a linear speed of 25
MB/sec down to 12 MB/sec. But it's just a guess.
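(That guess is arithmetically checkable: mixing slow and fast reads averages harmonically, because times add, not speeds. A sketch -- the 25% split in the usage note is an illustrative assumption, not a measurement:)

```c
/* Effective throughput when a fraction f of the bytes is transferred
 * at `slow` MB/sec and the rest at `fast` MB/sec.  Total time is the
 * sum of the two phases, so the blend is a weighted harmonic mean. */
double blended_mbps(double f, double slow, double fast)
{
    return 1.0 / (f / slow + (1.0 - f) / fast);
}
```

For example, if 25% of the bytes moved in sub-256-byte reads at 5 MB/sec and the rest at 25 MB/sec, the blend is 12.5 MB/sec -- close to the observed 12 MB/sec, so the guess is at least plausible.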
Max
On Mon, Feb 28, 2011 at 4:43 PM, Black, Michael (IS) <[email protected]
> wrote:
> Those numbers make sense to me.
>
> Since count is doing a linear walk through....correct?
>
> #1 SSDs more than likely don't do "read ahead" like a disk drive. So what
> you're seeing is what read-ahead does for you.
> #2 Count is doing a linear scan of the table...Probability of 2K containing
> the next page after 1K -- 100% (2X performance improvement)
> #3 Probability of 4K containing the next page after 2K -- 100% (2X
> improvement).
> #4 Probability of 8K containing the next page after 4K -- here the
> probability either drops or we're starting to hit the bandwidth of
> SSD+Sqlite -- I'm tempted to say that it's not the probability that drops.
>
> ________________________________________
> From: [email protected] [[email protected]] on
> behalf of Max Vlasov [[email protected]]
>
> my timing for Select Count for 400-500MB Bases and the table with about
> 1,000,000 records
>
> 1024:
> 40 Seconds, 98% system+user time, 500 MB Read, 12 MB/Sec
> 2048:
> 22 Seconds, 94% system+user time, 500 MB Read, 20 MB/Sec
> 4096 (This NTFS Cluster size)
> 11 Seconds, 96% system+user time, 450 MB Read, 32 MB/Sec
> 8192
> 8 Seconds, 87% system+user time, 420 MB Read, 50 MB/Sec
> 32768
> 8 Seconds, 56% system+user time, 410 MB Read, 50 MB/Sec
>
> Max
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users