Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread drh
Teg <[EMAIL PROTECTED]> wrote:
> Hello drh,
> 
> You tested under Windows with synchronous=OFF? I mean specifically
> that way. I've never seen Sqlite trash a DB when I had synchronous
> turned on even with app crashes.
> 

An application crash should not corrupt the DB even with
synchronous=OFF.  But with synchronous=OFF, an OS crash or
a power failure might corrupt the DB.  So if you feel like
your os is stable (windows never crashes?) and you have a
UPS or something to prevent untimely power loss, then it
is probably safe to set synchronous=OFF.

--
D. Richard Hipp  <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread P Kishor

see below for a counter perspective

On 2/22/07, John Stanton <[EMAIL PROTECTED]> wrote:

Thomas Fjellstrom wrote:
> On February 21, 2007, [EMAIL PROTECTED] wrote:
>
>>"Brett Keating" <[EMAIL PROTECTED]> wrote:
>>
>>>Hi,
>>>
>>>I'm curious about what the effect of having a blob in the database may
>>>be on performance. I have two design options: 1) put a small image file
>>>(15-30kbyte) into the database as a blob, and 2) store the image in a
>>>separate file on disk and hold the filename in the database. My table
>>>has around 20 rows in it, about half are strings/smaller blobs and half
>>>are integers.
>>>
>>>Option number one, for various reasons, is far more elegant and simple
>>>in terms of its impact on the rest of the code. However, I am concerned
>>>that holding such large amounts of data per record might impact
>>>performance. I could be worried about nothing though, which is why I'm
>>>writing to this list :).
>>
>>When I was designing the SQLite file format, I made the assumption
>>that BLOBs would be used infrequently and would not be very big.
>>The file format design is not optimized for storing BLOBs.  Indeed,
>>BLOBs are stored as a singly-linked list of database pages.  It
>>is hard to imagine a more inefficient design.
>>
>>Much to my surprise, people begin putting multi-megabyte BLOBs
>>in SQLite databases and reporting that performance really was not
>>an issue.  I have lately taken up this practice myself and routinely
>>uses SQLite database with BLOBs that are over 10MiB is size.  And
>>it all seems to work pretty well here on my Linux workstation.  I
>>have no explanation for why it works so well, but it does so I'm not
>>going to complain.
>>
>>If your images are only 30KiB, you should have no problems.
>>
>>Here's a hint though - make the BLOB columns the last column in
>>your tables.  Or even store the BLOBs in a separate table which
>>only has two columns: an integer primary key and the blob itself,
>>and then access the BLOB content using a join if you need to.
>>If you put various small integer fields after the BLOB, then
>>SQLite has to scan through the entire BLOB content (following
>>the linked list of disk pages) to get to the integer fields at
>>the end, and that definitely can slow you down.
>
>
> I still wonder about the utility of storing binary data in the db itself.
> Maybe it makes it more easy to distribute that way, but how often does one
> distribute an entire database in a "vendor specific" format?
>
> I'm quite interested in hearing people's reasoning for going the blob route,
> when you have a perfectly good "database" format for "blobs" already (various
> filesystems).

The BLOB method has two major advantages.  The first is that the data
all reside in one file.  The second is that it does not have a directory
size limit, a curse when storing files and one which requires extra
logic to build a tree structure of directories or some other way of
limiting directory size.

The downside is the relative slowness of retrieving large BLOBs,
although the anecdotal evidence is that this is not as much of a problem
as one would expect.

On balance I would use BLOBs for binary storage except in the case where
there is a predominance of very large files.
>


2 examples --

I had my website as a SQLite db. Since I am using a cheapo hosting
site, I was at the mercy of what they had. At some point, I update my
local version (on my laptop) to the then latest version of SQLite.
That was one of those inflection points at which the SQLite db format
was backward incompatible. The web host was not going to upgrade their
DBD::SQLite, and it was a lot of extra work for me to create my own
libs and use my my own DBD installs. I changed the backend to plain
text files that are stored in a file folder hierarchy like so
?/??/???/file (where ? are the first, the first two, and the first
three letters of the file name). The beauty of this approach is that I
can log in to my website remotely and just use vim to change
individual pages without having to do that from SQLite. I still use
SQLite for metadata.

Second example is for a destop application -- I was looking for an
application in which I could write disjointed thoughts and research
findings. One of the applications (I forget its name), really nice
app, used Coredata on Mac OS X Tiger as its storage. Yes, I could get
to the data, but if I ever stopped using that app, it would be a pain
to get my data out. I changed to another program (Scrivener) which
stores my documents as separate RTF files inside an OS X package. From
the outside, the package looks like a file, but right click on it, and
you can peek inside to find all the separate RTF documents. I can do
what I want to with those documents with TextEdit.

The above examples are not against the use of BLOBs. They are just
examples of where NOT storing BLOBs in a db is a strength. I know that
SQLite is very stable, and a lot of care has been taken to make it
corruption proof, 

Re[2]: [sqlite] Effect of blobs on performance

2007-02-22 Thread Teg
Hello drh,

You tested under Windows with synchronous=OFF? I mean specifically
that way. I've never seen Sqlite trash a DB when I had synchronous
turned on even with app crashes.

C


Thursday, February 22, 2007, 9:56:57 AM, you wrote:

dhc> Teg <[EMAIL PROTECTED]> wrote:
>> 
>> I'm pretty sure an application crash even without power failure can
>> corrupt to. At least in my experience. I keep synchronous on and
>> simply use "insert or ignore" syntax within a transaction to get
>> performance.
>> 

dhc> It is not suppose to.  There are extensive tests in the test
dhc> suite where we simulate application crashes and verify that the
dhc> database is not corrupted.  If you encounter a situation where
dhc> the database is corrupted, that is a bug and you should report
dhc> it.

dhc> --
dhc> D. Richard Hipp  <[EMAIL PROTECTED]>


dhc> 
-
dhc> To unsubscribe, send email to [EMAIL PROTECTED]
dhc> 
-




-- 
Best regards,
 Tegmailto:[EMAIL PROTECTED]


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread John Stanton

Thomas Fjellstrom wrote:

On February 21, 2007, [EMAIL PROTECTED] wrote:


"Brett Keating" <[EMAIL PROTECTED]> wrote:


Hi,

I'm curious about what the effect of having a blob in the database may
be on performance. I have two design options: 1) put a small image file
(15-30kbyte) into the database as a blob, and 2) store the image in a
separate file on disk and hold the filename in the database. My table
has around 20 rows in it, about half are strings/smaller blobs and half
are integers.

Option number one, for various reasons, is far more elegant and simple
in terms of its impact on the rest of the code. However, I am concerned
that holding such large amounts of data per record might impact
performance. I could be worried about nothing though, which is why I'm
writing to this list :).


When I was designing the SQLite file format, I made the assumption
that BLOBs would be used infrequently and would not be very big.
The file format design is not optimized for storing BLOBs.  Indeed,
BLOBs are stored as a singly-linked list of database pages.  It
is hard to imagine a more inefficient design.

Much to my surprise, people begin putting multi-megabyte BLOBs
in SQLite databases and reporting that performance really was not
an issue.  I have lately taken up this practice myself and routinely
uses SQLite database with BLOBs that are over 10MiB is size.  And
it all seems to work pretty well here on my Linux workstation.  I
have no explanation for why it works so well, but it does so I'm not
going to complain.

If your images are only 30KiB, you should have no problems.

Here's a hint though - make the BLOB columns the last column in
your tables.  Or even store the BLOBs in a separate table which
only has two columns: an integer primary key and the blob itself,
and then access the BLOB content using a join if you need to.
If you put various small integer fields after the BLOB, then
SQLite has to scan through the entire BLOB content (following
the linked list of disk pages) to get to the integer fields at
the end, and that definitely can slow you down.



I still wonder about the utility of storing binary data in the db itself. 
Maybe it makes it more easy to distribute that way, but how often does one 
distribute an entire database in a "vendor specific" format?


I'm quite interested in hearing people's reasoning for going the blob route, 
when you have a perfectly good "database" format for "blobs" already (various 
filesystems).


The BLOB method has two major advantages.  The first is that the data 
all reside in one file.  The second is that it does not have a directory 
size limit, a curse when storing files and one which requires extra 
logic to build a tree structure of directories or some other way of 
limiting directory size.


The downside is the relative slowness of retrieving large BLOBs, 
although the anecdotal evidence is that this is not as much of a problem 
as one would expect.


On balance I would use BLOBs for binary storage except in the case where 
there is a predominance of very large files.




--
D. Richard Hipp  <[EMAIL PROTECTED]>


---
-- To unsubscribe, send email to [EMAIL PROTECTED]
---
--








-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread Dennis Jenkins

[EMAIL PROTECTED] wrote:

Dennis Jenkins <[EMAIL PROTECTED]> wrote:
  
these are the settings that our app uses when it creates/opens the sqlite 
database:


db.ExecuteImmediate("PRAGMA synchronous=OFF");



With synchronous=OFF, a power failure might result in database
corruption.  Is this an issue for you?
--
D. Richard Hipp  <[EMAIL PROTECTED]>
  


Not really.  The data can be regenerated by the user without too much 
difficulty.  I don't have the numbers handy, but I seem to remember that 
I did some performance experiments and determined that the performance 
gains significantly outweighed the potential problems.  To my knowledge, 
for the past two years, only one user (out of many hundreds) has ever 
gotten a corrupt database.


I suppose I'll put in a ticket into our issue tracking system to review 
this decision.  According to a comment in our source code, I based this 
action on 
http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html#pragma-synchronous




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread drh
Teg <[EMAIL PROTECTED]> wrote:
> 
> I'm pretty sure an application crash even without power failure can
> corrupt to. At least in my experience. I keep synchronous on and
> simply use "insert or ignore" syntax within a transaction to get
> performance.
> 

It is not suppose to.  There are extensive tests in the test
suite where we simulate application crashes and verify that the
database is not corrupted.  If you encounter a situation where
the database is corrupted, that is a bug and you should report
it.

--
D. Richard Hipp  <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re[2]: [sqlite] Effect of blobs on performance

2007-02-22 Thread Teg
Hello drh,

Thursday, February 22, 2007, 9:08:08 AM, you wrote:

dhc> Dennis Jenkins <[EMAIL PROTECTED]> wrote:
>> 
>> these are the settings that our app uses when it creates/opens the sqlite 
>> database:
>> 
>> db.ExecuteImmediate("PRAGMA synchronous=OFF");

dhc> With synchronous=OFF, a power failure might result in database
dhc> corruption.  Is this an issue for you?
dhc> --
dhc> D. Richard Hipp  <[EMAIL PROTECTED]>


dhc> 
-
dhc> To unsubscribe, send email to [EMAIL PROTECTED]
dhc> 
-

I'm pretty sure an application crash even without power failure can
corrupt to. At least in my experience. I keep synchronous on and
simply use "insert or ignore" syntax within a transaction to get
performance.



-- 
Best regards,
 Tegmailto:[EMAIL PROTECTED]


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread Denis Sbragion
Hello Dan,

On Thu, February 22, 2007 06:08, Dan Kennedy wrote:
> * Can include blob operations as part of atomic transactions.

me too. Transactions are a major advantage of database blobs.

Bye,

-- 
Denis Sbragion
InfoTecna
Tel: +39 0362 805396, Fax: +39 0362 805404
URL: http://www.infotecna.it






-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread drh
Dennis Jenkins <[EMAIL PROTECTED]> wrote:
> 
> these are the settings that our app uses when it creates/opens the sqlite 
> database:
> 
> db.ExecuteImmediate("PRAGMA synchronous=OFF");

With synchronous=OFF, a power failure might result in database
corruption.  Is this an issue for you?
--
D. Richard Hipp  <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread Dennis Jenkins

Thomas Fjellstrom wrote:
I still wonder about the utility of storing binary data in the db itself. 
Maybe it makes it more easy to distribute that way, but how often does one 
distribute an entire database in a "vendor specific" format?


I'm quite interested in hearing people's reasoning for going the blob route, 
when you have a perfectly good "database" format for "blobs" already (various 
filesystems).
  


1)
We use the sqlite encryption extension.  We want our blobs encrypted as 
well as our database, so putting the blobs into the database makes 
sense.  We have a special table for the blobs that has a primary key (3 
columns) and the blob.  The rest of the data is contained in other tables.


2)
We don't need to worry about atomically deleting disk blobs and database 
rows.  We take advantage of the ACID nature of sqlite.  This way we 
don't have to code for contingencies where the user has managed to 
delete or corrupt a blob, or a blob that our app can't delete even when 
it deletes the database row.


3)
Having everything in one package.  Makes tech support much easier if the 
user only has to transmit a single file instead of an entire directory.


4)
We modify the blobs at runtime.  ACIDness of sqlite is very nice here.  
I don't want to try to re-implement this directly on the filesystem 
(even if it becomes a simple rename operation).



Our blobs vary in size from 12K to 3M.  Sqlite is not a performance 
bottleneck for us... the client's internet connection is.


I have not done extensive performance tests on these settings, but these 
are the settings that our app uses when it creates/opens the sqlite 
database:


   db.ExecuteImmediate("PRAGMA page_size=4096");
   db.ExecuteImmediate("PRAGMA legacy_file_format=ON");
   db.ExecuteImmediate("PRAGMA cache_size=8000");
   db.ExecuteImmediate("PRAGMA synchronous=OFF");
   db.ExecuteImmediate("PRAGMA temp_store=2");


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re[2]: [sqlite] Effect of blobs on performance

2007-02-22 Thread Teg
Hello drh,

Thursday, February 22, 2007, 7:54:58 AM, you wrote:


dhc> A common use for SQLite is as an application file format.  When
you do File->>Open, instead of reading and parsing a bunch of
dhc> information in a proprietary format, just call sqlite3_open()
dhc> on the file instead.  File->Save becomes a no-op.  There is no
dhc> parser to write.  Performance problems can often be fixed simply
dhc> by adding another index.  All of your data is written to disk 
dhc> atomically and is relatively safe from corruption even if you 
dhc> turn off the power during a write.  There are a lot of advantages
dhc> to this approach.

dhc> Lots of people are starting to use SQLite as an application
dhc> file format.  Remember the SQLite slogan:

dhc> SQLite is not a replacement for Oracle, it is a
dhc> replacement for fopen().

>> 
>> I'm quite interested in hearing people's reasoning for going the blob route, 
>> when you have a perfectly good "database" format for "blobs" already 
>> (various 
>> filesystems).
>> 

dhc> Just yesterday, Eric Scouten posted on this list that he had done
dhc> a study and found that for BLOB smaller than 20-30K it was faster
dhc> to store them in an SQLite database than on disk.
dhc> --
dhc> D. Richard Hipp  <[EMAIL PROTECTED]>

I was reading a similar study that stated that 150K was the knee point
in general for DB blob performance. I store 1000's of JPG images in
SQLite as blobs, ranging in size up to 4 megs or so each (though the
typical size is in the 200-300K range). My only issue with this method
is that enumeration seems slower than enumerating the same files in a
folder. I store them in DB's because they're nice units of data I can
backup and move around anywhere without having worry about losing the
connection between the DB and the files themselves. Because they're
images, I only ever want to read them all at once so, not having
random access within the blobs is unimportant to me.



-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread drh
Thomas Fjellstrom <[EMAIL PROTECTED]> wrote:
> 
> I still wonder about the utility of storing binary data in the db itself. 
> Maybe it makes it more easy to distribute that way, but how often does one 
> distribute an entire database in a "vendor specific" format?

A common use for SQLite is as an application file format.  When
you do File->Open, instead of reading and parsing a bunch of
information in a proprietary format, just call sqlite3_open()
on the file instead.  File->Save becomes a no-op.  There is no
parser to write.  Performance problems can often be fixed simply
by adding another index.  All of your data is written to disk 
atomically and is relatively safe from corruption even if you 
turn off the power during a write.  There are a lot of advantages
to this approach.

Lots of people are starting to use SQLite as an application
file format.  Remember the SQLite slogan:

SQLite is not a replacement for Oracle, it is a
replacement for fopen().

> 
> I'm quite interested in hearing people's reasoning for going the blob route, 
> when you have a perfectly good "database" format for "blobs" already (various 
> filesystems).
> 

Just yesterday, Eric Scouten posted on this list that he had done
a study and found that for BLOB smaller than 20-30K it was faster
to store them in an SQLite database than on disk.
--
D. Richard Hipp  <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-22 Thread drh
Ion Silvestru <[EMAIL PROTECTED]> wrote:
> >* In SQLite, my blobs won't be corrupted if the machine loses power
> >  the way they (probably) will be if I write my own code to access
> >  the file-system.
> 
> But, in case of a corruption, you will have entire blob DB corrupted versus
> at least one file (aka one row in DB) corrupted.
> 

Not.  SQLite databases do not corrupt when you turn the power off.
When power is restored and you attempt to access the database again,
the transaction that you were in the middle of at the point of the
power failure automatically rolls back, restoring the database to
a sane state.  This is one of the key reasons why you would want
to use SQLite instead of fopen() for storing miscellaneous data.

--
D. Richard Hipp  <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re[2]: [sqlite] Effect of blobs on performance

2007-02-21 Thread Ion Silvestru

>* In SQLite, my blobs won't be corrupted if the machine loses power
>  the way they (probably) will be if I write my own code to access
>  the file-system.

But, in case of a corruption, you will have entire blob DB corrupted versus
at least one file (aka one row in DB) corrupted.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re[2]: [sqlite] Effect of blobs on performance

2007-02-21 Thread Ion Silvestru

>>I'm fairly sure disk space requirements will be nearly identical in
>>each case...

In case of blobs in SQLite there will be less disk space used than in
case of file system (cluster size etc.)


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-21 Thread Dan Kennedy

> I'm quite interested in hearing people's reasoning for going the blob route, 
> when you have a perfectly good "database" format for "blobs" already (various 
> filesystems).



Three technical reasons for me personally:

* Can include blob operations as part of atomic transactions.

* In SQLite, my blobs won't be corrupted if the machine loses power
  the way they (probably) will be if I write my own code to access
  the file-system.

* Don't have to invent my own scheme for mapping from database entry
  to blob file, nor my own cross platform

And the claim that it's less work to program that way.

Better question IMO: What are peoples reasons for calling open() 
when you could just as easily call sqlite3_open()? :)



> > --
> > D. Richard Hipp  <[EMAIL PROTECTED]>
> >
> >
> > ---
> >-- To unsubscribe, send email to [EMAIL PROTECTED]
> > ---
> >--
> 
> 
> 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



RE: [sqlite] Effect of blobs on performance

2007-02-21 Thread Brett Keating
Thanks I think this answers my question well!

Brett 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, February 21, 2007 1:41 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Effect of blobs on performance

"Brett Keating" <[EMAIL PROTECTED]> wrote:
> Hi,
>  
> I'm curious about what the effect of having a blob in the database may

> be on performance. I have two design options: 1) put a small image 
> file
> (15-30kbyte) into the database as a blob, and 2) store the image in a 
> separate file on disk and hold the filename in the database. My table 
> has around 20 rows in it, about half are strings/smaller blobs and 
> half are integers.
>  
> Option number one, for various reasons, is far more elegant and simple

> in terms of its impact on the rest of the code. However, I am 
> concerned that holding such large amounts of data per record might 
> impact performance. I could be worried about nothing though, which is 
> why I'm writing to this list :).
>  

When I was designing the SQLite file format, I made the assumption that
BLOBs would be used infrequently and would not be very big.
The file format design is not optimized for storing BLOBs.  Indeed,
BLOBs are stored as a singly-linked list of database pages.  It is hard
to imagine a more inefficient design.

Much to my surprise, people begin putting multi-megabyte BLOBs in SQLite
databases and reporting that performance really was not an issue.  I
have lately taken up this practice myself and routinely uses SQLite
database with BLOBs that are over 10MiB is size.  And it all seems to
work pretty well here on my Linux workstation.  I have no explanation
for why it works so well, but it does so I'm not going to complain.

If your images are only 30KiB, you should have no problems.

Here's a hint though - make the BLOB columns the last column in your
tables.  Or even store the BLOBs in a separate table which only has two
columns: an integer primary key and the blob itself, and then access the
BLOB content using a join if you need to.
If you put various small integer fields after the BLOB, then SQLite has
to scan through the entire BLOB content (following the linked list of
disk pages) to get to the integer fields at the end, and that definitely
can slow you down.

--
D. Richard Hipp  <[EMAIL PROTECTED]>



-
To unsubscribe, send email to [EMAIL PROTECTED]

-




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



RE: [sqlite] Effect of blobs on performance

2007-02-21 Thread Brett Keating
Thanks for the quick reply. 

I was lysdexic, I meant to say 20 columns. Probably would never exceed
20,000 rows, most likely would hover around 2-4K rows in a typical
situation. 

If it has no effect on performance, I'd rather hold it in the database
because I do like the idea of having a "neat package" so to speak. For
instance when a record is deleted, I'd rather not worry about having to
check to see if it holds an image file and then go delete that file...
And if I delete many items at once with one statement, I'd have to break
it up into individual deletes and check each one in order to delete my
corresponding image file on disk... Otherwise I'll get these orphaned
image files all over the place on my disk that correspond to deleted
records. 

I'm fairly sure disk space requirements will be nearly identical in each
case... Just worried about query/update performance.

Thanks,
Brett 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of P
Kishor
Sent: Wednesday, February 21, 2007 1:38 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Effect of blobs on performance

On 2/21/07, Brett Keating <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm curious about what the effect of having a blob in the database may

> be on performance. I have two design options: 1) put a small image 
> file
> (15-30kbyte) into the database as a blob, and 2) store the image in a 
> separate file on disk and hold the filename in the database. My table 
> has around 20 rows in it, about half are strings/smaller blobs and 
> half are integers.
>
> Option number one, for various reasons, is far more elegant and simple

> in terms of its impact on the rest of the code. However, I am 
> concerned that holding such large amounts of data per record might 
> impact performance. I could be worried about nothing though, which is 
> why I'm writing to this list :).
>


unless you haven't told us the complete story, you are worried about
nothing probably. You have only 20 rows, fer crying out loud, you could
probably build an application with chalk and slate and it would be fast
enough ;-)

that said, there is much recurring discussion on this on another list
that I am on (folks wanting to store remote sensing imagery into
PostGres). Most concur that it is a ridiculous idea... just store the
metadata including the image path in the db, and then let the filesystem
do the heavy lifting. Usually one can't do much db level analysis on
blobs anyway, so storing them in the db becomes pointless other than the
fact that it is a nice neat single bundle. Well, wonder if you could
utilize the concept of packages the way they are on OS X.
>From the outside, a package looks like a file... you can grab it, copy
it, move it. Actually it is a folder in which you can have your core db
as well as all the images.

--
Puneet Kishor http://punkish.eidesis.org/ Nelson Inst. for Env. Studies,
UW-Madison http://www.nelson.wisc.edu/ Open Source Geospatial Foundation
http://www.osgeo.org/education/
-
collaborate, communicate, compete
=


-
To unsubscribe, send email to [EMAIL PROTECTED]

-




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-21 Thread Eric Scouten
My suggestion: Do a quick experiment. I had a similar question a year  
or so ago. I wrote some code a year or so that generated random blobs  
of varying sizes and tossed them into a SQLite DB and onto files on  
the file system (Mac OS).


There are some complicating variables, such as our application  
framework and some extra resource marshaling that might affect  
performance one way or another, but our results were that for blobs  
averaging <20-30K, SQLite was faster; for larger data sets, the file  
system was faster.


One thing to consider is how the presence of said blobs might impact  
performance of the database in general.


-Eric



On 21 Feb 2007, at 13:26, Brett Keating wrote:


Hi,

I'm curious about what the effect of having a blob in the database may
be on performance. I have two design options: 1) put a small image  
file

(15-30kbyte) into the database as a blob, and 2) store the image in a
separate file on disk and hold the filename in the database. My table
has around 20 rows in it, about half are strings/smaller blobs and  
half

are integers.

Option number one, for various reasons, is far more elegant and simple
in terms of its impact on the rest of the code. However, I am  
concerned

that holding such large amounts of data per record might impact
performance. I could be worried about nothing though, which is why I'm
writing to this list :).

Any advice or input?

Thanks,
Brett



-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Effect of blobs on performance

2007-02-21 Thread drh
"Brett Keating" <[EMAIL PROTECTED]> wrote:
> Hi,
>  
> I'm curious about what the effect of having a blob in the database may
> be on performance. I have two design options: 1) put a small image file
> (15-30kbyte) into the database as a blob, and 2) store the image in a
> separate file on disk and hold the filename in the database. My table
> has around 20 rows in it, about half are strings/smaller blobs and half
> are integers.
>  
> Option number one, for various reasons, is far more elegant and simple
> in terms of its impact on the rest of the code. However, I am concerned
> that holding such large amounts of data per record might impact
> performance. I could be worried about nothing though, which is why I'm
> writing to this list :).
>  

When I was designing the SQLite file format, I made the assumption
that BLOBs would be used infrequently and would not be very big.
The file format design is not optimized for storing BLOBs.  Indeed,
BLOBs are stored as a singly-linked list of database pages.  It
is hard to imagine a more inefficient design.

Much to my surprise, people begin putting multi-megabyte BLOBs
in SQLite databases and reporting that performance really was not
an issue.  I have lately taken up this practice myself and routinely
uses SQLite database with BLOBs that are over 10MiB is size.  And
it all seems to work pretty well here on my Linux workstation.  I
have no explanation for why it works so well, but it does so I'm not
going to complain.

If your images are only 30KiB, you should have no problems.

Here's a hint though - make the BLOB columns the last column in
your tables.  Or even store the BLOBs in a separate table which
only has two columns: an integer primary key and the blob itself,
and then access the BLOB content using a join if you need to.
If you put various small integer fields after the BLOB, then
SQLite has to scan through the entire BLOB content (following
the linked list of disk pages) to get to the integer fields at
the end, and that definitely can slow you down.

--
D. Richard Hipp  <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-