Re[2]: [sqlite] Effect of blobs on performance

2007-02-21 Thread Ion Silvestru

>* In SQLite, my blobs won't be corrupted if the machine loses power
>  the way they (probably) will be if I write my own code to access
>  the file-system.

But in case of corruption, the entire blob DB will be corrupted, versus
only the affected file (i.e. one row of the DB) with the file-system approach.





Re[2]: [sqlite] Effect of blobs on performance

2007-02-21 Thread Ion Silvestru

>>I'm fairly sure disk space requirements will be nearly identical in
>>each case...

With blobs in SQLite, less disk space will be used than with the
file system (due to cluster size, etc.).





Re: [sqlite] Precision of dates stored as Julian "real"

2007-02-21 Thread Martin Jenkins

[EMAIL PROTECTED] wrote:

Guess you can't please everybody :-)  Right now we have some
documentation in the source tree and some on the wiki, which
I suppose is guaranteed to please nobody.


So make the wiki available for download. ;)

Martin




Re: [sqlite] Effect of blobs on performance

2007-02-21 Thread Dan Kennedy

> I'm quite interested in hearing people's reasoning for going the blob route, 
> when you have a perfectly good "database" format for "blobs" already (various 
> filesystems).



Three technical reasons for me personally:

* Can include blob operations as part of atomic transactions.

* In SQLite, my blobs won't be corrupted if the machine loses power
  the way they (probably) will be if I write my own code to access
  the file-system.

* Don't have to invent my own scheme for mapping from database entry
  to blob file, nor my own cross platform

And the claim that it's less work to program that way.
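
For example (a minimal sketch of the first point; the table names and the
blob literal are made up for illustration), a blob write commits or rolls
back together with the rest of the row data:

BEGIN;
-- hypothetical tables: the raw bytes and the metadata change atomically
INSERT INTO images(id, data) VALUES(1, x'89504E470D0A1A0A');
UPDATE documents SET thumbnail_id = 1 WHERE doc_id = 42;
COMMIT;

If the machine loses power before the COMMIT, neither change is visible
after recovery.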

Better question IMO: What are people's reasons for calling open()
when you could just as easily call sqlite3_open()? :)



> > --
> > D. Richard Hipp  <[EMAIL PROTECTED]>





Re: [sqlite] Using AVG() Correctly

2007-02-21 Thread miguel manese

On 2/22/07, Rich Shepard <[EMAIL PROTECTED]> wrote:

   However, I'm stuck on the proper SQL syntax. A nudge in the right
direction -- including pointers to the appropriate documentation -- would be
much appreciated.

The "rule of thumb" is that anything that appears in the group-by
clause can appear directly in the select, while other columns are
aggregated in the select. Best example using your particular case is
Igor's answer:

select cat, pos, avg(col1), avg(col2), ...
from voting
group by cat, pos;

M. Manese




[sqlite] Re: Using AVG() Correctly

2007-02-21 Thread Igor Tandetnik

Rich Shepard <[EMAIL PROTECTED]> wrote:

  I have a table, 'voting,' with 31 columns. For each of 28 REAL
columns I need to calculate averages both by groups and total. I
tried:
sqlite> select AVG(pos) from voting where cat = 'eco';

and 0.0 was returned.


The query looks good. What's the data in the pos column? Could it be 
that the average is indeed zero?
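
One way to check (a hedged diagnostic, reusing the table and filter from
the question): if the pos values are stored as text that does not look
numeric, avg() converts them to 0, which would explain the 0.0 result.

select typeof(pos), count(*), avg(pos)
from voting
where cat = 'eco'
group by typeof(pos);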



   What I need to do with the data from table 'voting' (in Python
using pysqlite2) is to calculate the average of each of 28 numeric
columns for each of the three 'pos' column values within each of the
three 'cat' column values.


select cat, pos, avg(col1), avg(col2), ...
from voting
group by cat, pos;
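
And for the overall totals mentioned in the question, a separate
ungrouped query does the job (col1, col2 stand in for the 28 REAL
columns):

select avg(col1), avg(col2), ...
from voting;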

Igor Tandetnik 






Re: [sqlite] about default file permission of SQLite database file

2007-02-21 Thread miguel manese

This is not actually about SQLite; see man umask.

M. Manese

On 2/22/07, Shan, Zhe (Jay) <[EMAIL PROTECTED]> wrote:

Hi,

If I use SQLite to create a database on Linux, the database file will
be granted permission 644 by default.
Is this value hardcoded in the current version? Is it possible to
change this default value, say to 664 or something else?


Thanks.

Jay






[sqlite] about default file permission of SQLite database file

2007-02-21 Thread Shan, Zhe (Jay)

Hi,

If I use SQLite to create a database on Linux, the database file will
be granted permission 644 by default.
Is this value hardcoded in the current version? Is it possible to
change this default value, say to 664 or something else?


Thanks.

Jay


Re: [sqlite] Precision of dates stored as Julian "real"

2007-02-21 Thread Dennis Cote

[EMAIL PROTECTED] wrote:


Sure you can.  You just have to put the expression in parentheses
(to avoid a parsing conflict).  Try this:

CREATE TABLE test1(
  date TEXT DEFAULT (strftime('%Y-%m-%d %H:%M:%f', 'now')),
  dummy int
);
INSERT INTO test1(dummy) VALUES(1);
SELECT * FROM test1;

  
Oh. OK then. I just went by the documentation at 
http://www.sqlite.org/lang_createtable.html. This isn't mentioned there 
at all.


Would you like a bug report for the documentation, or will you just fix it?
I would suggest that adding a small amount of code bloat to accept 
another magic default value that stores the high resolution julian day 
number



This would break backwards compatibility.  It would not be a major
break, but it could possibly cause problems for some users.
  
How would adding a new JULIAN_TIMESTAMP keyword, which is a magic alias
for julianday('now') in the same fashion as CURRENT_TIMESTAMP is an alias
for datetime('now'), break backwards compatibility? The new keyword would
only be used in new code. The old one would continue to work as it does
now in old code. The benefit would be smaller, more flexible databases
for those who adopt the new functionality. There would be no change for
those who don't.
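
In the meantime, the parenthesized-expression default shown above already
lets you store the high-resolution julian day directly (a sketch only;
the table and column names are illustrative):

CREATE TABLE event(
  ts REAL DEFAULT (julianday('now')),   -- 8-byte REAL julian day
  payload TEXT
);
INSERT INTO event(payload) VALUES('x');
SELECT ts, datetime(ts) FROM event;     -- convert to ISO8601 for display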



I need to start a wiki page on proposed minor incompatibilities
so that when I go from 3.3.x to 3.4.0 I can be sure to get them all.
What was that one we discussed on this mailing list earlier this week?

  
I agree with this. I remember something else was discussed too, but I'm 
not sure what it was either.


I looked back and found that it was the PRIMARY KEY and NOT NULL issue 
raised by Eric Johansson.


Dennis Cote






RE: [sqlite] Effect of blobs on performance

2007-02-21 Thread Brett Keating
Thanks, I think this answers my question well!

Brett 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, February 21, 2007 1:41 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Effect of blobs on performance

"Brett Keating" <[EMAIL PROTECTED]> wrote:
> Hi,
>  
> I'm curious about what the effect of having a blob in the database may
> be on performance. I have two design options: 1) put a small image file
> (15-30kbyte) into the database as a blob, and 2) store the image in a
> separate file on disk and hold the filename in the database. My table
> has around 20 rows in it, about half are strings/smaller blobs and
> half are integers.
>
> Option number one, for various reasons, is far more elegant and simple
> in terms of its impact on the rest of the code. However, I am
> concerned that holding such large amounts of data per record might
> impact performance. I could be worried about nothing though, which is
> why I'm writing to this list :).
>  

When I was designing the SQLite file format, I made the assumption that
BLOBs would be used infrequently and would not be very big.
The file format design is not optimized for storing BLOBs.  Indeed,
BLOBs are stored as a singly-linked list of database pages.  It is hard
to imagine a more inefficient design.

Much to my surprise, people began putting multi-megabyte BLOBs in SQLite
databases and reporting that performance really was not an issue.  I
have lately taken up this practice myself and routinely use SQLite
databases with BLOBs that are over 10MiB in size.  And it all seems to
work pretty well here on my Linux workstation.  I have no explanation
for why it works so well, but it does, so I'm not going to complain.

If your images are only 30KiB, you should have no problems.

Here's a hint though - make the BLOB columns the last column in your
tables.  Or even store the BLOBs in a separate table which only has two
columns: an integer primary key and the blob itself, and then access the
BLOB content using a join if you need to.
If you put various small integer fields after the BLOB, then SQLite has
to scan through the entire BLOB content (following the linked list of
disk pages) to get to the integer fields at the end, and that definitely
can slow you down.

--
D. Richard Hipp  <[EMAIL PROTECTED]>










RE: [sqlite] Effect of blobs on performance

2007-02-21 Thread Brett Keating
Thanks for the quick reply. 

I was lysdexic, I meant to say 20 columns. Probably would never exceed
20,000 rows, most likely would hover around 2-4K rows in a typical
situation. 

If it has no effect on performance, I'd rather hold it in the database
because I do like the idea of having a "neat package" so to speak. For
instance when a record is deleted, I'd rather not worry about having to
check to see if it holds an image file and then go delete that file...
And if I delete many items at once with one statement, I'd have to break
it up into individual deletes and check each one in order to delete my
corresponding image file on disk... Otherwise I'll get these orphaned
image files all over the place on my disk that correspond to deleted
records. 

I'm fairly sure disk space requirements will be nearly identical in each
case... Just worried about query/update performance.

Thanks,
Brett 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of P
Kishor
Sent: Wednesday, February 21, 2007 1:38 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Effect of blobs on performance

On 2/21/07, Brett Keating <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm curious about what the effect of having a blob in the database may
> be on performance. I have two design options: 1) put a small image file
> (15-30kbyte) into the database as a blob, and 2) store the image in a
> separate file on disk and hold the filename in the database. My table
> has around 20 rows in it, about half are strings/smaller blobs and
> half are integers.
>
> Option number one, for various reasons, is far more elegant and simple
> in terms of its impact on the rest of the code. However, I am
> concerned that holding such large amounts of data per record might
> impact performance. I could be worried about nothing though, which is
> why I'm writing to this list :).
>


Unless you haven't told us the complete story, you are probably worried
about nothing. You have only 20 rows, fer crying out loud; you could
probably build an application with chalk and slate and it would be fast
enough ;-)

That said, there is much recurring discussion on this on another list
that I am on (folks wanting to store remote sensing imagery in
Postgres). Most concur that it is a ridiculous idea... just store the
metadata, including the image path, in the db, and then let the filesystem
do the heavy lifting. Usually one can't do much db-level analysis on
blobs anyway, so storing them in the db becomes pointless other than the
fact that it is a nice, neat single bundle. Well, I wonder if you could
utilize the concept of packages the way they work on OS X.
From the outside, a package looks like a file... you can grab it, copy
it, move it. Actually it is a folder in which you can have your core db
as well as all the images.

--
Puneet Kishor http://punkish.eidesis.org/ Nelson Inst. for Env. Studies,
UW-Madison http://www.nelson.wisc.edu/ Open Source Geospatial Foundation
http://www.osgeo.org/education/
-
collaborate, communicate, compete
=









Re: [sqlite] Effect of blobs on performance

2007-02-21 Thread Eric Scouten
My suggestion: do a quick experiment. I had a similar question a year
or so ago. I wrote some code that generated random blobs
of varying sizes and tossed them into a SQLite DB and into files on
the file system (Mac OS).


There are some complicating variables, such as our application  
framework and some extra resource marshaling that might affect  
performance one way or another, but our results were that for blobs  
averaging <20-30K, SQLite was faster; for larger data sets, the file  
system was faster.


One thing to consider is how the presence of said blobs might impact  
performance of the database in general.


-Eric



On 21 Feb 2007, at 13:26, Brett Keating wrote:


Hi,

I'm curious about what the effect of having a blob in the database may
be on performance. I have two design options: 1) put a small image file
(15-30kbyte) into the database as a blob, and 2) store the image in a
separate file on disk and hold the filename in the database. My table
has around 20 rows in it, about half are strings/smaller blobs and half
are integers.

Option number one, for various reasons, is far more elegant and simple
in terms of its impact on the rest of the code. However, I am concerned
that holding such large amounts of data per record might impact
performance. I could be worried about nothing though, which is why I'm
writing to this list :).

Any advice or input?

Thanks,
Brett






Re: [sqlite] Effect of blobs on performance

2007-02-21 Thread drh
"Brett Keating" <[EMAIL PROTECTED]> wrote:
> Hi,
>  
> I'm curious about what the effect of having a blob in the database may
> be on performance. I have two design options: 1) put a small image file
> (15-30kbyte) into the database as a blob, and 2) store the image in a
> separate file on disk and hold the filename in the database. My table
> has around 20 rows in it, about half are strings/smaller blobs and half
> are integers.
>  
> Option number one, for various reasons, is far more elegant and simple
> in terms of its impact on the rest of the code. However, I am concerned
> that holding such large amounts of data per record might impact
> performance. I could be worried about nothing though, which is why I'm
> writing to this list :).
>  

When I was designing the SQLite file format, I made the assumption
that BLOBs would be used infrequently and would not be very big.
The file format design is not optimized for storing BLOBs.  Indeed,
BLOBs are stored as a singly-linked list of database pages.  It
is hard to imagine a more inefficient design.

Much to my surprise, people began putting multi-megabyte BLOBs
in SQLite databases and reporting that performance really was not
an issue.  I have lately taken up this practice myself and routinely
use SQLite databases with BLOBs that are over 10MiB in size.  And
it all seems to work pretty well here on my Linux workstation.  I
have no explanation for why it works so well, but it does, so I'm
not going to complain.

If your images are only 30KiB, you should have no problems.

Here's a hint though - make the BLOB columns the last column in
your tables.  Or even store the BLOBs in a separate table which
only has two columns: an integer primary key and the blob itself,
and then access the BLOB content using a join if you need to.
If you put various small integer fields after the BLOB, then
SQLite has to scan through the entire BLOB content (following
the linked list of disk pages) to get to the integer fields at
the end, and that definitely can slow you down.
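
A sketch of that layout (table and column names are made up for
illustration only):

CREATE TABLE photo(
  id INTEGER PRIMARY KEY,
  title TEXT,
  size INTEGER
);
CREATE TABLE photo_blob(
  photo_id INTEGER PRIMARY KEY,   -- same value as photo.id
  data BLOB
);
-- fetch the image only when it is actually needed
SELECT photo.title, photo_blob.data
FROM photo JOIN photo_blob ON photo_blob.photo_id = photo.id
WHERE photo.id = 1;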

--
D. Richard Hipp  <[EMAIL PROTECTED]>





Re: [sqlite] Precision of dates stored as Julian "real"

2007-02-21 Thread drh
"P Kishor" <[EMAIL PROTECTED]> wrote:
> 
> is it possible to add usage such as the above, and many, many
> wonderful SQL suggestions routinely provided by Igor Tandetnik (thanks
> Igor!) to the syntax docs in the form of user-submitted comments?
> 

I was trying to move all of the documentation into wiki for
exactly this reason.  But then people said they didn't like
that and wanted all the documentation in the source tree.
Guess you can't please everybody :-)  Right now we have some
documentation in the source tree and some on the wiki, which
I suppose is guaranteed to please nobody.
--
D. Richard Hipp  <[EMAIL PROTECTED]>





Re: [sqlite] Precision of dates stored as Julian "real"

2007-02-21 Thread P Kishor

On 2/21/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

Dennis Cote <[EMAIL PROTECTED]> wrote:
>
> The problem is that you can't use a function like strftime as the
> default value for a column when you create a table. It only accepts
> NULL, a string constant, a number, or one of the magic current_* values.
>

Sure you can.  You just have to put the expression in parentheses
(to avoid a parsing conflict).  Try this:

CREATE TABLE test1(
  date TEXT DEFAULT (strftime('%Y-%m-%d %H:%M:%f', 'now')),
  dummy int
);
INSERT INTO test1(dummy) VALUES(1);
SELECT * FROM test1;


..


I need to start a wiki page on proposed minor incompatibilities
so that when I go from 3.3.x to 3.4.0 I can be sure to get them all.
What was that one we discussed on this mailing list earlier this week?



is it possible to add usage such as the above, and many, many
wonderful SQL suggestions routinely provided by Igor Tandetnik (thanks
Igor!) to the syntax docs in the form of user-submitted comments?

--
Puneet Kishor http://punkish.eidesis.org/
Nelson Inst. for Env. Studies, UW-Madison http://www.nelson.wisc.edu/
Open Source Geospatial Foundation http://www.osgeo.org/education/
-
collaborate, communicate, compete
=




Re: [sqlite] Precision of dates stored as Julian "real"

2007-02-21 Thread drh
Dennis Cote <[EMAIL PROTECTED]> wrote:
> 
> The problem is that you can't use a function like strftime as the 
> default value for a column when you create a table. It only accepts
> NULL, a string constant, a number, or one of the magic current_* values.
> 

Sure you can.  You just have to put the expression in parentheses
(to avoid a parsing conflict).  Try this:

CREATE TABLE test1(
  date TEXT DEFAULT (strftime('%Y-%m-%d %H:%M:%f', 'now')),
  dummy int
);
INSERT INTO test1(dummy) VALUES(1);
SELECT * FROM test1;

> 
> I would suggest that adding a small amount of code bloat to accept 
> another magic default value that stores the high resolution julian day 
> number

This would break backwards compatibility.  It would not be a major
break, but it could possibly cause problems for some users.

I need to start a wiki page on proposed minor incompatibilities
so that when I go from 3.3.x to 3.4.0 I can be sure to get them all.
What was that one we discussed on this mailing list earlier this week?

--
D. Richard Hipp  <[EMAIL PROTECTED]>





Re: [sqlite] Precision of dates stored as Julian "real"

2007-02-21 Thread Dennis Cote

[EMAIL PROTECTED] wrote:


This isn't a bug. 


The magic current_timestamp keyword is really an alias for
"datetime('now')".  And datetime('now') returns you a text
string in the format "YY-MM-DD HH:MM:SS".  Yes, this means
that the seconds have been rounded to the nearest whole
second.  But that is what the current_timestamp keyword is
documented to do.  Imagine the outcry and all the broken code 
if current_timestamp suddenly started returning dates
in the format "YY-MM-DD HH:MM:SS.SSS".  


We could think about adding a new magic keyword,
perhaps current_hires_timestamp, that included the fractional
seconds.  Such a magic keyword would be an alias
for strftime('%Y-%m-%d %H:%M:%f','now'). But adding this
new keyword will be a hard sell since there is a lot of
emphasis on avoiding code bloat.  So for now, you are better
off just using the strftime function if you need a high
resolution timestamp.  

Actually, current_timestamp (and datetime) returns the full 4-digit year.

The problem is that you can't use a function like strftime as the 
default value for a column when you create a table. It only accepts
NULL, a string constant, a number, or one of the magic current_* values.



Or (and better in my view) store
your dates as julian day numbers and convert to ISO8601
for display only.
  


I thought that was what the current_timestamp did!

You are saying that SQLite really stores the 19-byte date and time
string instead of the 8-byte julian day number as a floating-point
double. This seems awfully wasteful. I can see where this would be of
benefit to systems without floating point, so maybe there is a good 
reason for this, but it is not what I expected.
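
A quick way to see the difference from the sqlite3 shell (illustrative
only):

SELECT current_timestamp,           -- 'YYYY-MM-DD HH:MM:SS', whole seconds
       length(current_timestamp),   -- 19 characters of text
       julianday('now');            -- a single REAL with fractional seconds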


I suspect a few others might be expecting finer resolution from the 
current_timestamp when used to add a create or modified timestamp to a 
record. The current behavior seems to increase the likelihood of
timestamp conflicts in such applications.


I would suggest that adding a small amount of code bloat to accept
another magic default value that stores the high-resolution julian day
number would be worth the cost for both database size reduction and
increased resolution. The user can always format this time any way they
like for display in the SELECT statements where arbitrary functions are
allowed. I think the best name (current_timestamp) is already taken, so
I would suggest julian_timestamp. It would have been better if the
original magic values were current_date, current_time, and
current_datetime, in which case the new value would logically be
current_timestamp, but that can't be changed now.


Dennis Cote









Re: [sqlite] Re: Unexpected Query Results

2007-02-21 Thread Rich Shepard

On Wed, 21 Feb 2007, Igor Tandetnik wrote:


Yes, that's what GROUP BY does. One representative for each group.


Igor,

  A-ha! It has been a long time for me.


select * from voting order by cat, pos;


  Thank you very much. Makes sense now.

Rich

--
Richard B. Shepard, Ph.D.   |The Environmental Permitting
Applied Ecosystem Services, Inc.|  Accelerator(TM)
 Voice: 503-667-4517  Fax: 503-667-8863




[sqlite] Re: Unexpected Query Results

2007-02-21 Thread Igor Tandetnik

Rich Shepard <[EMAIL PROTECTED]> wrote:

  The table has 180 rows and 31 columns. What I need to do is extract
the records and group them by two columns (one as a sub-group of the
other). However, even one 'group by' retrieves only three records,
the last one for each group:


Yes, that's what GROUP BY does. One representative for each group.


  What I expected was all 180 records, with 60 in each of the 'cat'
fields (second field above).


select * from voting order by cat;


  Ultimately, I need to group them by the second field (name ==
'cat') and within each of those, sub-group them by the third field
(name == 'pos').


select * from voting order by cat, pos;

Igor Tandetnik




Re: [sqlite] types3-1.3 fails on x86_64

2007-02-21 Thread drh
Alexey Tourbin <[EMAIL PROTECTED]> wrote:
> Hello,
> 
> All tests pass on i386 but the following test fails on x86_64:
> 
> types3-1.3...
> Expected: [wideInt integer]
>  Got: [int integer]

This is a failure in the test harness, not in SQLite itself.
This is nothing to worry about.
--
D. Richard Hipp  <[EMAIL PROTECTED]>





[sqlite] types3-1.3 fails on x86_64

2007-02-21 Thread Alexey Tourbin
Hello,

All tests pass on i386 but the following test fails on x86_64:

types3-1.3...
Expected: [wideInt integer]
 Got: [int integer]

