Re: [sqlite] Uncached performance

2015-01-08 Thread Teg
Hello Donald,


DG> 1) Might there be further performance gains by placing the blobs in a
DG> separate table?
DG> E.g.
DG> CREATE TABLE myBlobs (
DG> idINTEGER PRIMARY KEY REFERENCES global (id),
DG> value BLOB
DG> );
DG> Then (if you haven't rebuilt a new db, and perhaps only once ever) run
DG> VACUUM and ANALYZE.

This  is  how  I  do  it.  Seems  to  give  me the best performance. I
sometimes put them in their own DB file and then attach this file too.
I have DB's with 30+ gigs of blobs in them. 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Uncached performance

2015-01-08 Thread Donald Griggs
Hi, Maksim,

Others with more knowledge than I have given great advice regarding placing
your blob as the *last* table column, and about looking at cache size and
page size.

I wondered about three things:

1) Might there be further performance gains by placing the blobs in a
separate table?
E.g.
CREATE TABLE myBlobs (
idINTEGER PRIMARY KEY REFERENCES global (id),
value BLOB
);
Then (if you haven't rebuilt a new db, and perhaps only once ever) run
VACUUM and ANALYZE.

2) Any chance your app might be faster with your blobs in files instead of
in the db?  Is the following info useful?
 https://www.sqlite.org/intern-v-extern-blob.html


3) Are you truly certain the solution against which you compared sqlite did
not have the data in system cache?

Don't know if this is helpful,
Donald G.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Uncached performance

2015-01-08 Thread David King
> SS> Is there a difference between a key being present in your database with
> NULL value, and the key not being present ?
> Surely there is. But, actually, those 1 queries are made only to keys
> with a non-NULL value (usually less than 1 Kb, and maybe 20% are about 16
> kb).

*Probably* not related to your performance concerns (since as you say, 
searching for the key doesn't look to be your issue), but for this case you may 
consider a partial index https://www.sqlite.org/partialindex.html to reduce the 
number of seeks a little bit by reducing the depth of the index tree.


signature.asc
Description: Message signed with OpenPGP using GPGMail
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Uncached performance

2015-01-08 Thread Kees Nuyt
On Thu, 8 Jan 2015 15:04:28 +0500, ?? ??? 
wrote:

> CREATE TABLE global (
> [key] VARCHAR (1024),
> value BLOB,
> level INTEGER NOT NULL,
> original_name VARCHAR (1024),
> id INTEGER PRIMARY KEY AUTOINCREMENT,
> parent_id REFERENCES global (id)
> );

The order of columns looks suboptimal.
It's better to put smaller columns first, so all PK and key info is in
the first page, and is not pushed to an overflow page when the values of
text or blob columns have a biggish length().

Other remarks:
* VARCHAR() translates to TEXT in SQLite.
* parent_id missed a type definition.

Typically, your table would look like:

CREATE TABLE global (
id INTEGER PRIMARY KEY AUTOINCREMENT,
parent_id INTEGER REFERENCES global (id),
level INTEGER NOT NULL,
[key] TEXT,
original_name TEXT,
value BLOB
);

Other things to consider:
PRAGMA page_size=something_larger_than_default;
PRAGMA cache_size= .. ;

-- 
Regards,

Kees Nuyt

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Uncached performance

2015-01-08 Thread Максим Гумеров
Thanks for your responces!

SS> This suggests that you are filling up a cache, especially if your BLOBs
are large.  In terms of overall time, 1 queries in 2 seconds is 5000
queries a second.  If you are using a rotating hard disk then this is not
unexpected, given that your disk probably rotates at 5,400 rpm, that gives
an average latency of 5.55ms /per read

> On the first run, thought, 1 queries take
>about 30 seconds!
KM> Sounds normal as you are loading up the OS cache with data.

In principle, of course, it's how caching works :) But, like I wrote, there
is another engine and when using it (with non-SQLite database with the same
contents), I am able to perform that 1 queries a dozen times faster,
even on a 1st run. This means SQLite uses a non-optimal solution in this
case; well that's quite OK, 'cause that another engine was optimized for
this particular DB structure, while SQLite was not. So the question is
whether things in SQLite could be sped up by, for example, tweaking some
SQLite settings / fine-tuning DB structure / splitting a big database into
several smaller ones / etc.? If there are no brilliant ideas, then maybe
it's pure bad luck by which all the required keys are spread over the
entire file in SQLite DB, while in my other DB they are condensed in a
portion of the file so that reading just a portion of the file already
fills the cache with all the necessary data. I asked my question in
assumption (possibly wrong) that this is not the case, and that there is
another reason.

Of course, if those 1 queries were meant to read the entire DB, then of
course any way to do that would require to ultimately scan the entire file
- every possible DB engine would have to do that. But since we speak of
just 10'000 keys (out of 130'000), then ideally only a portion of the file
has to be read. So, if this is the reason of delays (and I am not sure it
is, but isn't it probable?), - if SQLite reads more data than it would be
ideally necessary, - is there a way to tweak something in such a way to
reduce that excessing file accesses?

KM>   In order for caching to work on a USB attached device, it must not be
mounted in "let the user pull it out at any time" mode (the default), or
what Microsoft calls "optimize for random user yanking", but must be
mounted as a connected device (called optimized for performance).
Well OK if you say so, thought I believed that is only true for
write-caching (which of course is dangerous if the device is removable),
while read-caching does not depend on whether the drive is removable or
not. Seemed logical to me, but might be wrong.

SS> Is there a difference between a key being present in your database with
NULL value, and the key not being present ?
Surely there is. But, actually, those 1 queries are made only to keys
with a non-NULL value (usually less than 1 Kb, and maybe 20% are about 16
kb).


2015-01-08 16:24 GMT+05:00 Keith Medcalf :

>
> >My database weights a little less than 2 Gbs and contains 130'000 keys.
> >When I put it on HDD and try to make 1 queries (extracting values for
> >1 different keys) with some additional processing of extracted
> >values, it takes about 4 seconds on my PC on any run except the first,
> >with maybe half of that time being the "additional processing" mentioned;
> >even when I perform every query 3 times in a row (making it 3
> queries),
> >this time does not change. On the first run, thought, 1 queries take
> >about 30 seconds!
>
> Sounds normal as you are loading up the OS cache with data.  Accessing
> data cached in RAM is much faster than accessing data on spinning disk.
> Have you set SQLite's page cache equal to a reasonable working set since
> access to application cached pages is even faster than access to the OS
> file cache -- and it avoids a jump into the kernel to do I/O which puts you
> at the mercy of the dispatcher.  Are you 100% I/O bound during the long
> (initial) run, and 100% CPU bound during the quicker runs?
>
> >When I put the file on a USB flash drive, somehow I always get about 45
> >seconds total time, on either first run and subsequent runs. When the
> >queries are tripled, the total time is tripled as well (even though the
> >portions of the file to be read should already be cached when every
> >enuqie query is repeated 2 more times).
>
> >This leads me to the conclusion that 1) The delays are produced by
> >physical reading of the file, not by searching for a key or returning
> >the value;
>
> >2) file on USB never gets cached (why?? due to some file-mode flags used
> by
> >SQLite engine? or it's just a MS Windows 8 issue?)
>
> This is an OS "feature".  In order for caching to work on a USB attached
> device, it must not be mounted in "let the user pull it out at any time"
> mode (the default), or what Microsoft calls "optimize for random user
> yanking", but must be mounted as a connected device (called optimized for
> performance).  You should also make sure you have i

Re: [sqlite] Uncached performance

2015-01-08 Thread Keith Medcalf

>My database weights a little less than 2 Gbs and contains 130'000 keys.
>When I put it on HDD and try to make 1 queries (extracting values for
>1 different keys) with some additional processing of extracted
>values, it takes about 4 seconds on my PC on any run except the first, 
>with maybe half of that time being the "additional processing" mentioned; 
>even when I perform every query 3 times in a row (making it 3 queries), 
>this time does not change. On the first run, thought, 1 queries take 
>about 30 seconds!

Sounds normal as you are loading up the OS cache with data.  Accessing data 
cached in RAM is much faster than accessing data on spinning disk.  Have you 
set SQLite's page cache equal to a reasonable working set since access to 
application cached pages is even faster than access to the OS file cache -- and 
it avoids a jump into the kernel to do I/O which puts you at the mercy of the 
dispatcher.  Are you 100% I/O bound during the long (initial) run, and 100% CPU 
bound during the quicker runs?

>When I put the file on a USB flash drive, somehow I always get about 45
>seconds total time, on either first run and subsequent runs. When the
>queries are tripled, the total time is tripled as well (even though the
>portions of the file to be read should already be cached when every
>enuqie query is repeated 2 more times).

>This leads me to the conclusion that 1) The delays are produced by
>physical reading of the file, not by searching for a key or returning 
>the value;

>2) file on USB never gets cached (why?? due to some file-mode flags used by
>SQLite engine? or it's just a MS Windows 8 issue?)

This is an OS "feature".  In order for caching to work on a USB attached 
device, it must not be mounted in "let the user pull it out at any time" mode 
(the default), or what Microsoft calls "optimize for random user yanking", but 
must be mounted as a connected device (called optimized for performance).  You 
should also make sure you have it formatted as NTFS and not as a FAT variant of 
some type.

Further, USB itself is very slow, both in transfer time and especially in 
turnaround time.  Flash drives are also very slow, usually because they do not 
need to operate at a speed faster than can be supported by the slow USB 
interface.  Is this a USB 1.0, 2.0, or 3.0 connection and device?





___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Uncached performance

2015-01-08 Thread Simon Slavin

On 8 Jan 2015, at 10:04am, Максим Гумеров  wrote:

> When I put it on HDD and try to make 1 queries (extracting values for
> 1 different keys) with some additional processing of extracted values,
> it takes about 4 seconds on my PC on any run except the first, with maybe
> half of that time being the "additional processing" mentioned; even when I
> perform every query 3 times in a row (making it 3 queries), this time
> does not change. On the first run, thought, 1 queries take about 30
> seconds!

This suggests that you are filling up a cache, especially if your BLOBs are 
large.  In terms of overall time, 1 queries in 2 seconds is 5000 queries a 
second.  If you are using a rotating hard disk then this is not unexpected, 
given that your disk probably rotates at 5,400 rpm, that gives an average 
latency of 5.55ms /per read/.

> CREATE TABLE global (
> 
>   [key] VARCHAR (1024),
>   value BLOB,
>   level INTEGER NOT NULL,
>   original_name VARCHAR (1024),
>   id INTEGER PRIMARY KEY AUTOINCREMENT,
>   parent_id REFERENCES global (id)
> 
> );

SQLite will interpret "VARCHAR (1024)" as TEXT and will not truncate.  I'm just 
warning you.

> value is a BLOB (and for 50% keys is just NULL)

Is there a difference between a key being present in your database with NULL 
value, and the key not being present ?

> And there are 3 separate indices: by level, key, and parent_id.

An index which is only on level would probably be pointless.  I suspect you're 
more likely to want to have that index include the key, too.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Uncached performance

2015-01-08 Thread Максим Гумеров
Hello!
Imagine I am trying to introduce a local key-value SQLite database for
caching some data retrieved from a remote server. Key is a character
string, value is a BLOB (and for 50% keys is just NULL). I will specify the
details later.

My database weights a little less than 2 Gbs and contains 130'000 keys.
When I put it on HDD and try to make 1 queries (extracting values for
1 different keys) with some additional processing of extracted values,
it takes about 4 seconds on my PC on any run except the first, with maybe
half of that time being the "additional processing" mentioned; even when I
perform every query 3 times in a row (making it 3 queries), this time
does not change. On the first run, thought, 1 queries take about 30
seconds!
When I put the file on a USB flash drive, somehow I always get about 45
seconds total time, on either first run and subsequent runs. When the
queries are tripled, the total time is tripled as well (even though the
portions of the file to be read should already be cached when every enuqie
query is repeated 2 more times).

This leads me to the conclusion that 1) The delays are produced by physical
reading of the file, not by searching for a key or returning the value; 2)
file on USB never gets cached (why?? due to some file-mode flags used by
SQLite engine? or it's just a MS Windows 8 issue?)

Now, does anybody know whether it's a known issue for SQLite that the
access is slow when the DB file is not yet cached by OS (I mean, it's
slower that one could expect: we have been previously using a hand-made
engine with which those 1 queries took just 4 seconds!)? Or maybe
that's an issue specific to something, like, string-based indexes, or this
large databases (approx. 2 Gb), or to something else?

The details:

CREATE TABLE global (

[key] VARCHAR (1024),

value BLOB,

level INTEGER NOT NULL,

original_name VARCHAR (1024),

id INTEGER PRIMARY KEY AUTOINCREMENT,

parent_id REFERENCES global (id)

);

And there are 3 separate indices: by level, key, and parent_id.

So the actual PK is integer, but most queries are like key=something.


I am opening the DB with

sqlite3_open_v2(fdatabase, fdb, 1{ SQLITE_OPEN_READONLY} , nil);

and querying it with

sqlite3_prepare(fdb, 'select value from global where key = ifnull(?,)',
-1, fReadValueQuery, ptail); + sqlite3_bind_text + sqlite3_step
+ sqlite3_column_blob


Maksim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users