[sqlite] Of shared cache, table locks and transactions

2007-01-13 Thread Peter James

Hey folks...

I have a situation that caused me a little head-scratching and I'm wondering
if it's intended behavior or not.

I'm running a server thread (roughly based on test_server.c in the distro)
on top of the 3.3.6 library.  The effective call sequence in question (all
from one thread) looks something like this:

sqlite3_open("/path/to/db", &db_one);

sqlite3_prepare(db_one, "CREATE TABLE foo (id);", -1, &stmt_one, NULL);
sqlite3_step(stmt_one);
sqlite3_finalize(stmt_one);

sqlite3_prepare(db_one, "BEGIN DEFERRED;", -1, &stmt_one, NULL);
sqlite3_step(stmt_one);
sqlite3_finalize(stmt_one);

sqlite3_prepare(db_one, "INSERT INTO foo VALUES (123);", -1, &stmt_one, NULL);
sqlite3_step(stmt_one);
sqlite3_finalize(stmt_one);

sqlite3_prepare(db_one, "SELECT count(*) FROM foo;", -1, &stmt_one, NULL);
sqlite3_step(stmt_one);
// point of interest #1
sqlite3_column_int(stmt_one, 0);
sqlite3_finalize(stmt_one);

// new connection here, previous transaction still pending...
sqlite3_open("/path/to/db", &db_two);

sqlite3_prepare(db_two, "SELECT count(*) FROM foo;", -1, &stmt_two, NULL);
// point of interest #2
sqlite3_step(stmt_two);
// point of interest #3
sqlite3_column_int(stmt_two, 0);
sqlite3_finalize(stmt_two);

If shared cache is DISabled, then I get "1" on the first point of interest
and "0" on the third point of interest, which is what I'd expect.  The
database file is at a RESERVED lock state in both locations, and the first
point of interest gets uncommitted data since it's in the same connection,
while the third point of interest can't yet see that data since it's a
different connection and the transaction is not yet committed.

On the other hand, if shared cache is ENabled, then I get "1" on the first
point of interest and SQLITE_LOCKED at the second point of interest.  This
would seem to indicate an actual degradation of concurrency by using shared
caching.  Without shared caching, readers in the same thread as a pending
writer are allowed.  With shared caching, they are not.  The EXPLAIN output
seems to confirm that this is a result of the binary nature of table locks
vs. the staged nature of sqlite file locks.
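For what it's worth, shared-cache mode does have a documented escape hatch for exactly this case: PRAGMA read_uncommitted lets a connection read past the table locks of other connections in the same cache. A rough Python sketch of the scenario (the temp path and connection names are mine, not from the report above; without the pragma, the second SELECT fails with "database table is locked"):

```python
import os
import sqlite3
import tempfile

# Two connections sharing one cache, as in the server-thread setup above.
path = os.path.join(tempfile.mkdtemp(), "test.db")
db_one = sqlite3.connect("file:%s?cache=shared" % path, uri=True, isolation_level=None)
db_two = sqlite3.connect("file:%s?cache=shared" % path, uri=True, isolation_level=None)

# Allow db_two to read while db_one's deferred transaction holds the
# shared-cache table lock.
db_two.execute("PRAGMA read_uncommitted = 1")

db_one.execute("CREATE TABLE foo (id)")
db_one.execute("BEGIN DEFERRED")
db_one.execute("INSERT INTO foo VALUES (123)")

same_conn = db_one.execute("SELECT count(*) FROM foo").fetchone()[0]   # point #1
other_conn = db_two.execute("SELECT count(*) FROM foo").fetchone()[0]  # points #2/#3
print(same_conn, other_conn)
```

Note that read_uncommitted makes the second read a dirty read, so it trades isolation for the old concurrency.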

This came up when I was running the JUnit tests for the SQLiteJDBC driver (
http://zentus.com/sqlitejdbc).  TransactionTest.testInsert() failed out in
the manner described above when using a server-based shared-cache backend.
I scanned the CVS logs and didn't see anything obvious between 3.3.6 and
3.3.10 that would change this behavior.

Thoughts?

Thanks,
Pete.


[sqlite] Re: How to optimize a select that gets 17 rows from a 700k row DB (via perl using DBI)?

2007-01-13 Thread Jonathan . W . Crane
First off, thanks for the help and sorry for the formatting of the 
message.  I didn't know how it was going to turn out and I probably was 
overly optimistic as well as too verbose.

Secondly, as I feared, seems like it was an XY question, so sorry for that 
as well.

I'll address the two replies I can see so far, and some of the info in 
each section will likely overlap.

Nico:

I guess that's the trick, to have the "current" or at least "recent" 
database and then the historical one.  As of now, the process of polling 
the 17 machines takes about 40 seconds or so (when I first started running 
the process minutely, it was 20, so you can see I have to do something 
soon :))

So assuming the two-db model, what's the trick to it?  Here are some ideas 
off the top of my head--can you (or any reader) please give me your 
thoughts (be as brutal as you like--I'm under no illusion that I know what 
I'm talking about):

1)  The "current" table only ever has 17 rows.
    a) Have some kind of thing built into the script that runs minutely to
       copy the "current" data to the historical DB before kicking off the
       part that updates the current data.
    b) Add a trigger to the DB where the SQLite engine takes care of the
       copy somehow--this would probably be more difficult since I don't
       know how to add a trigger and I am thinking that the historical
       database will be in a different file altogether.
    c) Something I haven't thought of.

2)  The current table is only allowed to have a maximum of N rows.  Upon 
reaching this size, data are moved to the historical database and only the 
most recent observations for each machine are left in the current DB.  Not 
sure how I could do that.  Is there a way to do this within SQLite?

3)  A job runs every night or week (at a time when people are least likely 
to be using the page such as 3 am) that transfers the data from the 
current DB to the historical, leaving only the most recent observation for 
each machine.
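For options 1a/3 above, SQLite's ATTACH makes the periodic move a short job: attach the historical file, copy everything except each machine's newest row, then delete the same rows. A sketch with invented file names and a trimmed-down schema (the real stats table has more columns):

```python
import os
import sqlite3
import tempfile

# Hypothetical paths and a cut-down schema, just to show the shape of it.
d = tempfile.mkdtemp()
db = sqlite3.connect(os.path.join(d, "current.db"))
db.execute("ATTACH DATABASE ? AS hist", (os.path.join(d, "history.db"),))

db.execute("CREATE TABLE stats (Machine text, Load real, Timestamp text)")
db.execute("CREATE TABLE hist.stats (Machine text, Load real, Timestamp text)")
db.executemany("INSERT INTO stats VALUES (?, ?, ?)",
               [("peaklx1", 0.5, "2007-01-12 23:00"),
                ("peaklx1", 0.2, "2007-01-12 23:01"),
                ("peaklx2", 1.0, "2007-01-12 23:00")])

# Copy everything except each machine's newest row, then delete those rows.
db.execute("""INSERT INTO hist.stats
              SELECT * FROM stats
              WHERE Timestamp < (SELECT max(Timestamp) FROM stats s2
                                 WHERE s2.Machine = stats.Machine)""")
db.execute("""DELETE FROM stats
              WHERE Timestamp < (SELECT max(Timestamp) FROM stats s2
                                 WHERE s2.Machine = stats.Machine)""")
db.commit()

live = db.execute("SELECT count(*) FROM stats").fetchone()[0]
moved = db.execute("SELECT count(*) FROM hist.stats").fetchone()[0]
print(live, moved)  # 2 1
```

Run from cron (or tacked onto the end of the polling script) this keeps the current table at one row per machine without any application-side bookkeeping.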

Jay:

The closer to real-time, the better.  The most often a cron job can run 
under Linux is minutely, and minutely is pretty good.  I guess I could 
have the summary process occur at the end of the script that polls the 
machines.  It could generate static HTML, which would presumably make the 
page load super fast.  However, under the current regime, the process of 
creating that summary is going to take at least 10 seconds.  40 seconds 
for polling + 10 seconds for summarizing = 50 seconds, and that number is 
only going to get bigger!  So I'll have to figure out a better table 
structure anyway.

Additional thoughts:

In general, I think splitting the tables up is the way to go.  Any further 
comments/suggestions appreciated!

Jonathan






-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Re: A little help with count

2007-01-13 Thread Cesar Rodas

Great solution Igor!

On 13/01/07, Igor Tandetnik <[EMAIL PROTECTED]> wrote:


Lloyd Thomas wrote:
> I wish to create a query where I do a number of counts on the same
> table but
> with different filters.
> ie:
> count(id) as numrows
> count(id) as inrows where direction = 'In'
> count(id) as outrows where direction = 'Out'
>
> Could I do the above in a single query?

select
   count(*) as numrows,
   count(case direction when 'In' then 1 else NULL end) as inrows,
   count(case direction when 'Out' then 1 else NULL end) as outrows
from mytable;


Igor Tandetnik








--
Cesar Rodas
http://www.sf.net/projects/pagerank (The PageRank made easy...)
Mobile Phone: 595 961 974165
Phone: 595 21 645590
[EMAIL PROTECTED]
[EMAIL PROTECTED]


[sqlite] Re: A little help with count

2007-01-13 Thread Igor Tandetnik

Lloyd Thomas wrote:

I wish to create a query where I do a number of counts on the same
table but 
with different filters.

ie:
count(id) as numrows
count(id) as inrows where direction = 'In'
count(id) as outrows where direction = 'Out'

Could I do the above in a single query?


select
   count(*) as numrows,
   count(case direction when 'In' then 1 else NULL end) as inrows,
   count(case direction when 'Out' then 1 else NULL end) as outrows
from mytable;
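For a quick sanity check, the same query run from Python against Lloyd's four sample rows gives the expected split (the table name mytable is a stand-in, as above):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE mytable (id INTEGER, direction TEXT)")
db.executemany("INSERT INTO mytable VALUES (?, ?)",
               [(1, 'In'), (2, 'Out'), (3, 'Out'), (4, 'In')])

# count() skips NULLs, so each CASE expression counts only its own rows.
numrows, inrows, outrows = db.execute("""
    SELECT count(*),
           count(CASE direction WHEN 'In' THEN 1 ELSE NULL END),
           count(CASE direction WHEN 'Out' THEN 1 ELSE NULL END)
    FROM mytable
""").fetchone()
print(numrows, inrows, outrows)  # 4 2 2
```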


Igor Tandetnik





Re: [sqlite] A little help with count

2007-01-13 Thread Cesar Rodas

select count(id), "numrows" as type from table
union
select count(id), "inrows" as type from table where direction = 'In'
union
select count(id) as id, "outrows" as type from table where direction = 'Out'


Could this work? I'm sorry I can't try it here, because I am not at my work
or my home and I don't have SQLite here -- try it and tell me whether it
works or not.
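The UNION form does run (checked below from Python), with two caveats: `table` is a reserved word, so the sketch renames it to the hypothetical `calls`, and UNION ALL is the safer spelling, since plain UNION sorts and de-duplicates the three one-row results for no benefit:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# "calls" stands in for the reserved word "table" used in the post.
db.execute("CREATE TABLE calls (id INTEGER, direction TEXT)")
db.executemany("INSERT INTO calls VALUES (?, ?)",
               [(1, 'In'), (2, 'Out'), (3, 'Out'), (4, 'In')])

rows = db.execute("""
    SELECT count(id), 'numrows' AS type FROM calls
    UNION ALL
    SELECT count(id), 'inrows' FROM calls WHERE direction = 'In'
    UNION ALL
    SELECT count(id), 'outrows' FROM calls WHERE direction = 'Out'
""").fetchall()

# Each branch yields one (count, label) row.
result = {label: n for n, label in rows}
print(result)
```

Unlike Igor's single-pass version, this scans the table three times, which matters once the table gets large.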

Best Regards

On 13/01/07, Lloyd Thomas <[EMAIL PROTECTED]> wrote:



I wish to create a query where I do a number of counts on the same table
but
with different filters.
ie:
count(id) as numrows
count(id) as inrows where direction = 'In'
count(id) as outrows where direction = 'Out'

Could I do the above in a single query?

| id | date       | direction | duration | cost |
| 1  | 2007-01-01 | In        | 56       | 0.00 |
| 2  | 2007-01-01 | Out       | 60       | 0.10 |
| 3  | 2007-01-02 | Out       | 47       | 0.10 |
| 4  | 2007-01-02 | In        | 120      | 0.20 |


Thx
Lloydie T








--
Cesar Rodas
http://www.sf.net/projects/pagerank (The PageRank made easy...)
Mobile Phone: 595 961 974165
Phone: 595 21 645590
[EMAIL PROTECTED]
[EMAIL PROTECTED]


[sqlite] A little help with count

2007-01-13 Thread Lloyd Thomas


I wish to create a query where I do a number of counts on the same table but 
with different filters.

ie:
count(id) as numrows
count(id) as inrows where direction = 'In'
count(id) as outrows where direction = 'Out'

Could I do the above in a single query?

| id | date       | direction | duration | cost |
| 1  | 2007-01-01 | In        | 56       | 0.00 |
| 2  | 2007-01-01 | Out       | 60       | 0.10 |
| 3  | 2007-01-02 | Out       | 47       | 0.10 |
| 4  | 2007-01-02 | In        | 120      | 0.20 |


Thx
Lloydie T 






Re: [sqlite] How to optimize a select that gets 17 rows from a 700k row DB (via perl using DBI)?

2007-01-13 Thread Jay Sprenkle

Have you considered running a process that summarizes the data from the table
and just fetching the last summary for display? Will a periodic snapshot work
for your reporting or do you need realtime summarization?


On 1/12/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

Hi all.  I will try to post all data I can think of.  The basic issue is
that I have a Perl CGI script that shows the most recent load stats for a
bunch of servers we use at our site for long, computing-intense programs.
The environment is all RedHat Enterprise Linux 4 (2.6.9-42.26.ELsmp #1
SMP i686 i686 i386 GNU/Linux).  SQLite version is 3.2.2.  As the number of
records in my database has increased, the page load time has also
increased, now to about 10 seconds, which is unacceptable, IMO.

The idea is to give users an idea of the least loaded machine to help them
make the decision of which to use for their next program.  The whole
concept of how to best do this is, of course, in and of itself a
complicated issue.  At my org. there is no other system in place so this
is how I do it.  I've gotten a big positive response from users.

The last implementation used a CSV file and each iteration of gathering
the data from the servers only allowed the cron job to run every five
minutes.  After upgrading to SQLite, the now minutely cron job has never
failed to complete successfully in under one minute that I know of.  I'd
like to focus on optimizing my current implementation, and then, if
relevant, talk about other, perhaps better ways to do it.

Currently, both the old, 5-minutely CSV version and the current beta
(SQLite) version run in parallel, and the CSV version is still more widely
used since the page actually loads in a reasonable amount of time.  The
CSV data are overwritten each time so the process of reading in that data
is always the same.  The SQLite database is INSERTed into because I want
to keep historical data.  The rationale for this is explained later.

The current system uses a DB with a table called stats that has more or
less minutely data on various conditions for ~17 so-called peak machines.
The table can be described thus:

CREATE TABLE stats (Machine text not null, Load real, Scratch text(4),
  Mem int, MemPctFree int, Procs int, Users int,
  Timestamp text(20) not null, Message text);

My CGI script fetches the most recent observation for each machine with a
select statement of the form:

select a.* from stats a, (select Machine, max(Timestamp) as M from stats
group by machine) b where a.machine=b.machine and a.timestamp=b.M order by
load, Mem*MemPctFree desc, Scratch desc;

I floundered about trying to find a select statement that would simply
give me the right answer; I suspect the main opportunity for optimization
is in rewriting that statement.
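One cheap thing worth trying before restructuring anything: the subquery's max(Timestamp)-per-Machine scan is exactly the access pattern an index on (Machine, Timestamp) serves, and the same index speeds the a.machine=b.machine join. A hedged sketch with a trimmed schema and made-up sample rows (the real table is the 700k-row file, of course):

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the real DB file
db.execute("""CREATE TABLE stats (Machine text not null, Load real, Scratch text,
                                  Mem int, MemPctFree int, Timestamp text not null)""")
db.executemany("INSERT INTO stats VALUES (?, ?, ?, ?, ?, ?)",
               [("peaklx1", 0.5, "58G", 12, 88, "2007-01-12 23:28"),
                ("peaklx1", 0.0, "58G", 12, 90, "2007-01-12 23:29"),
                ("peaklx2", 1.0, "59G", 12, 65, "2007-01-12 23:29")])

# Without this, the max(Timestamp) subquery walks the whole table.
db.execute("CREATE INDEX IF NOT EXISTS idx_machine_ts ON stats (Machine, Timestamp)")

rows = db.execute("""
    SELECT a.* FROM stats a,
         (SELECT Machine, max(Timestamp) AS M FROM stats GROUP BY Machine) b
    WHERE a.Machine = b.Machine AND a.Timestamp = b.M
    ORDER BY a.Load, a.Mem * a.MemPctFree DESC, a.Scratch DESC
""").fetchall()
print([r[0] for r in rows])  # ['peaklx1', 'peaklx2']
```

Prefixing the query with EXPLAIN QUERY PLAN before and after creating the index shows whether SQLite actually picks it up; on 3.2.2 the optimizer is fairly limited, so an upgrade may help too.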

The CGI script uses DBI and the exact statement is my $aref =
$dbh->selectall_arrayref($ss2); where $ss2 is the select statement above.

The idea being that the least loaded machines are first in the list.  The
usefulness of the stats reported are of course debatable, but again, let's
make sure the problem is addressed separately.  Some of the machines have
very large scratch partitions which make those machines particularly
suitable for jobs that generate very large files.

A typical group of rows might look like this:


Machine   Load  Scratch  Mem (GB)  Free mem (%)  Procs  Users  Timestamp                  Message
peaklx6   0     21G      12        96            1      1      [EMAIL PROTECTED]:28:35
peaklx1   0     58G      12        88            1      1      [EMAIL PROTECTED]:29:03
peaklx5   0     19G      12        86            1      1      [EMAIL PROTECTED]:28:33
peaklx4   0              12        84            1      4      [EMAIL PROTECTED]:28:31
peaklx2   0     59G      12        65            1      2      [EMAIL PROTECTED]:29:05
peaklx21  0              4         14            1      2      [EMAIL PROTECTED]:29:07
peaklx3   0.1            12        78            1      3      [EMAIL PROTECTED]:28:27
fstlx1    0.2   20G      4         5             1      11     [EMAIL PROTECTED]:28:37
peaklx22  1              4         53            2      0      [EMAIL PROTECTED]:29:09
peaklx24  2              4         58            3      0      [EMAIL PROTECTED]:29:12
peaklx29  2              4         40            3      3      [EMAIL PROTECTED]:28:25
peaklx23  2              4         39            3      0      [EMAIL PROTECTED]:29:11
peaklx30  2              4         29            3      0      [EMAIL PROTECTED]:28:29
peaklx28  2              4         28            3      0      [EMAIL PROTECTED]:28:23
peaklx26  3              12        69            4      0      [EMAIL PROTECTED]:29:17
peaklx25  4              4         21            5      0      [EMAIL PROTECTED]:29:15
peaklx27  4              4         17            5      2      [EMAIL PROTECTED]:28:21


The problem I am having is that the page load times are getting very, very
long.  The table now has about 700,000 rows and the select statement is
taking about 10 seconds to execute (I benchmarked various parts of the
program and it's not the DB connect or loading the modules or whatnot).
The DB file lives on an NFS-mounted share, but running the same query
locally and interactively takes approximately the same amount of time.
The size of the DB file itself is currently 89MB.  The web server (and all
the servers in question) are dual processor 2.8GHz Xeons with a minimum of
4GB of memory (all running the same OS).  Some have hyperthreading turned
on and others don't.

I tried putting a copy of the DB file in the same directory as the script;

Re: [sqlite] 3.3.10 data corruption on updating fts1 string table

2007-01-13 Thread ohadp

Glad I could help find something, hope I put you in the right direction with
the source sample.
In the meantime I've moved to standard tables due to other limitations
imposed by FTS1/2.
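In case it helps anyone else hitting this: until the UPDATE path is fixed, a DELETE plus re-INSERT of the same rowid inside one transaction gets the same effect without going through the broken code. The sketch below uses fts4 only because fts1 is generally not compiled into current builds; the pattern is the same, and if your build lacks FTS entirely the CREATE VIRTUAL TABLE line will fail.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# fts4 stands in for fts1 here; the workaround pattern is identical.
db.execute("CREATE VIRTUAL TABLE t USING fts4(content)")
db.execute("INSERT INTO t (rowid, content) VALUES (1, 'this is a test')")

# Workaround for the broken UPDATE path: replace the row in two steps,
# inside one transaction so readers never observe the row missing.
with db:
    db.execute("DELETE FROM t WHERE rowid = 1")
    db.execute("INSERT INTO t (rowid, content) VALUES (1, 'that was a test')")

after = db.execute("SELECT content FROM t WHERE rowid = 1").fetchone()[0]
print(after)  # that was a test
```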


Scott Hess wrote:
> 
> OK, there's definite meat, here.  I have other reports of users seeing
> this problem.  It's specifically related to doing UPDATE against an
> fts1 or fts2 table.  INSERT and DELETE both work fine.  As far as I
> can tell, UPDATE may have never worked, or may have worked only in
> specific circumstances.  More as it's available.
> 
> -scott
> 
> 
> On 1/12/07, ohadp <[EMAIL PROTECTED]> wrote:
>>
>> looks like the file came in without CRLF, here goes:
>>
>> --
>> #include "../sqlite-3_3_8/sqlite3.h"
>> #include <tchar.h>
>>
>> static sqlite3* db;
>>
>> void exec_dml(const TCHAR* cmd)
>> {
>> sqlite3_stmt*   vm;
>>
>> sqlite3_prepare16(db, cmd, -1, &vm, 0);
>> sqlite3_step(vm);
>> sqlite3_finalize(vm);
>> }
>>
>> void exec_query(const TCHAR* cmd)
>> {
>> sqlite3_stmt*   vm;
>>
>> sqlite3_prepare16(db, cmd, -1, &vm, 0);
>> if (sqlite3_step(vm) == SQLITE_ROW)
>> {
>> TCHAR* result = (TCHAR*)sqlite3_column_text16(vm, 0);
>> result=result;
>> }
>>
>> sqlite3_finalize(vm);
>> }
>>
>> int _tmain(int argc, _TCHAR* argv[])
>> {
>> sqlite3_open16(_T("test.db"), &db);
>> exec_dml(_T("CREATE VIRTUAL TABLE t USING fts1(content);"));
>> exec_dml(_T("INSERT INTO t (rowid, content) VALUES (1, 'this is a
>> test');"));
>>
>> exec_query(_T("SELECT content FROM t WHERE rowid = 1;"));
>> exec_dml(_T("UPDATE t SET content = 'that was a test' WHERE rowid
>> = 1;"));
>> exec_query(_T("SELECT content FROM t WHERE rowid = 1;"));
>>
>> return 0;
>> }
>> --
>> --
>> View this message in context:
>> http://www.nabble.com/3.3.10-data-corruption-on-updating-fts1-string-table-tf2960926.html#a8305111
>> Sent from the SQLite mailing list archive at Nabble.com.
>>
>>
>>
>>
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/3.3.10-data-corruption-on-updating-fts1-string-table-tf2960926.html#a8312740
Sent from the SQLite mailing list archive at Nabble.com.





[sqlite] SQLITE_ENABLE_LOCKING_STYLE

2007-01-13 Thread Marco Bambini
I am sorry if the question is too obvious but I haven't found any  
documentation about the SQLITE_ENABLE_LOCKING_STYLE macro (it is in  
os.c).

Should I set it to 0 (default) or 1?
I need to be able to access database files located on shared volumes  
on OS X.

What are the implications of setting it to 1?

Thanks a lot.
---
Marco Bambini
http://www.sqlabs.net
http://www.sqlabs.net/blog/
http://www.sqlabs.net/realsqlserver/







Re: [sqlite] How to optimize a select that gets 17 rows from a 700k row DB (via perl using DBI)?

2007-01-13 Thread Nicolas Williams
On Sat, Jan 13, 2007 at 12:57:43AM -0500, [EMAIL PROTECTED] wrote:
>  The SQLite database 
> is INSERTed into because I want to keep historical data.  The rationale 
> for this is explained later.

For your main application (finding the least loaded machine) you don't
need historical data, so put the historical data in a separate table or
else add a column that you'll set to NULL for historical data and which
will be part of an index/primary key so that historical data is not
indexed.  And since you have so few "live" rows you might as well just
have two tables and for the non-historical table don't even bother with
indexes.
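Nico's two-table split can even be automated with a trigger, so the polling script keeps doing plain INSERTs into the live table. A sketch (table and column names are illustrative, not the original schema):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE live (Machine text, Load real, Timestamp text)")
db.execute("CREATE TABLE history (Machine text, Load real, Timestamp text)")

# Before each insert, shuffle the machine's previous row off to history,
# so live always holds exactly one (the newest) row per machine.
db.execute("""CREATE TRIGGER archive BEFORE INSERT ON live
              BEGIN
                INSERT INTO history SELECT * FROM live WHERE Machine = NEW.Machine;
                DELETE FROM live WHERE Machine = NEW.Machine;
              END""")

db.execute("INSERT INTO live VALUES ('peaklx1', 0.5, '23:28')")
db.execute("INSERT INTO live VALUES ('peaklx1', 0.2, '23:29')")
db.execute("INSERT INTO live VALUES ('peaklx2', 1.0, '23:29')")

live = db.execute("SELECT Machine, Load FROM live ORDER BY Machine").fetchall()
hist = db.execute("SELECT count(*) FROM history").fetchone()[0]
print(live, hist)  # [('peaklx1', 0.2), ('peaklx2', 1.0)] 1
```

With the live table capped at 17 rows, the "least loaded machine" select needs no indexes at all; history stays append-only, which keeps the minutely inserts cheap.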

Nico
-- 
