[sqlite] Using SQLite to record Web log file data (a threading question)

2006-10-28 Thread David Gewirtz

I've been exploring SQLite for a number of applications, but one I'd like to
embark on soon is using SQLite to record Web site log file data, so I can
perform SQL-based analysis on my logs rather than using something like
Analog.

Unfortunately, each Web access on the server is likely to be in its own
thread. The scenario I'd probably have to build might go something like
this:

* Web server launches
* Web server calls sqlite3_open() on the log db, gets log_ID (a handle)
* Web server handles user 1 in thread 1, which writes to log db
* Web server handles user 2 in thread 2, which writes to log db
* Web server handles user n in thread n, which writes to log db
* Web server handles admin request for log analysis, which reads from log db
* Web server begins shutdown, closes log db
* Web server shuts down

From my reading, it's just not clear to me whether this is bad behavior for
SQLite 3.3.7. Can SQLite handle this sort of sequence reliably? If not, any
suggestions about how I might proceed or how I should think about it?
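
If sharing the one handle opened at launch across threads is the sticking
point, the fallback I imagine is one connection per thread, something like
this sketch (the schema and names are invented, not a definitive design):

/* Sketch: each thread opens its own connection and sets a busy
 * timeout so concurrent writers retry instead of failing with
 * SQLITE_BUSY.  Table name and schema are invented. */
#include <sqlite3.h>
#include <stdio.h>

static int log_hit(const char *db_path, const char *url)
{
    sqlite3 *db;
    char *err = NULL;

    if (sqlite3_open(db_path, &db) != SQLITE_OK) {
        fprintf(stderr, "open: %s\n", sqlite3_errmsg(db));
        sqlite3_close(db);
        return 1;
    }
    sqlite3_busy_timeout(db, 2000);  /* retry writes for up to 2 s */

    char *sql = sqlite3_mprintf(
        "INSERT INTO access_log(ts, url) "
        "VALUES (strftime('%%s','now'), %Q);", url);
    int rc = sqlite3_exec(db, sql, NULL, NULL, &err);
    if (rc != SQLITE_OK) {
        fprintf(stderr, "insert: %s\n", err);
        sqlite3_free(err);
    }
    sqlite3_free(sql);
    sqlite3_close(db);
    return rc != SQLITE_OK;
}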

Thanks,
David






Re: [sqlite] Re: Regarding sqlite3_exec

2006-10-28 Thread Isaac Raway

You shouldn't even consider loading a thousand records directly.

Get a set of ID numbers, then load each record as it's needed. This is
very basic stuff, and not that hard to implement. I am just saying for
the record that this is not hard to do; hopefully no one else will be
scared away from the concept.

As a general rule of coding, do it right the first time. One thread
reading a list of integers while another runs the GUI is not complex.
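
Concretely, the pattern is just this (a sketch; table and column names
are invented): grab the ids once, then fetch a row only when the UI
needs it.

/* Sketch: ids up front, rows on demand.  Names are invented.
 * Once, at load time:  SELECT id FROM records ORDER BY name;
 * keep the ids in an array.  Then, when row N scrolls into view: */
#include <sqlite3.h>
#include <stdio.h>

static void show_row(sqlite3 *db, sqlite3_int64 id)
{
    sqlite3_stmt *stmt;
    if (sqlite3_prepare(db,
            "SELECT name FROM records WHERE id = ?;",
            -1, &stmt, NULL) != SQLITE_OK)
        return;
    sqlite3_bind_int64(stmt, 1, id);
    if (sqlite3_step(stmt) == SQLITE_ROW)
        printf("%s\n", (const char *)sqlite3_column_text(stmt, 0));
    sqlite3_finalize(stmt);
}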

Isaac

On 10/28/06, Da Martian <[EMAIL PROTECTED]> wrote:

Hi

Thanks for the response. The main reason is that my record count could be
anywhere from a few thousand to a million. But even at lowly record counts
the interface can seem sluggish if you read every record before displaying
anything.

As you mention, and as has been discussed above, doing stuff in the
background is a good way to go, but more complex. As a general rule of
coding I put as few unnecessary threads into a "phase 1" program as I can,
because the complexity goes up hugely, threads can be complex to use,
coordinate, interrupt etc., and the chance of bugs goes up dramatically.
So I tend to leave that for a "Phase 2" bells-and-whistles phase, and only
when there isn't a simpler way.

This thread has covered just about all the approaches I can think of :-)
thanks for the responses.

S


On 10/27/06, Isaac Raway <[EMAIL PROTECTED]> wrote:
>
> Why don't you design the table with a unique row ID, stored in an
> integer field, then fetch a list of those ID numbers?
>
> For 5000 rows, assuming you store them in your application as 4-byte
> longs, that's about 19 KB of memory.
>
> Counting that result as you receive it isn't that difficult. If it
> takes a long time (it probably won't) you can do it in another thread
> and update the interface as appropriate.
>
> I'm not seeing a downside here.
>
> Isaac
>
> On 10/26/06, Da Martian <[EMAIL PROTECTED]> wrote:
> > No there isn't, but RDBM systems are a generalised data retrieval
> > mechanism. As such they suffer from that generality.
> >
> > Don't get me wrong, RDBM systems are appropriate for 95% of all data
> > requirements I have had to deal with and I would never dream of
> > trying to write one from scratch, nor can I imagine a world without
> > them.
> >
> > However certain applications (weather data, genome data, large
> > indices (like Google's)) require something designed specifically for
> > that purpose. If you customise data retrieval (and in particular
> > your sorting/indices/access path) you can leave RDBMSs in the dust
> > in terms of performance. All I have read about Google suggests they
> > do exactly this. Although I must point out, I don't actually know
> > anything about Google with any certainty, just what has "leaked" out
> > over the years on the rumour mill. But in designing my own
> > "Google"-like indices (on a smaller scale of course) and some
> > specialised weather stuff, it was necessary to throw away the RDBMS
> > and do it yourself. For a Google query, for instance, they know they
> > will get a list of 1 or more words. They also know they will only
> > ever search through the index of words. They don't have other data
> > types, records or tables. Why go through all the hassle of compiling
> > SQL, and that generic overhead, when your application will only ever
> > do one thing? You can just make an API like this:
> > "search(wordlist): Resultset".
> >
> > You immediately save yourself complexity and processing time. Then
> > for large indices you will know your data set, so instead of using a
> > standard BTree you would use a more appropriate data structure,
> > possibly with skip lists etc.
> >
> > As for performing a database search twice, this whole thread has
> > shown that sometimes you have to :-)
> >
> > S
> >
> > On 10/25/06, John Stanton <[EMAIL PROTECTED]> wrote:
> > >
> > > There is no magic in data retrieval.  Google use the same physical
> > > laws as us ordinary mortals.
> > >
> > > I see no reason to ever perform a database search twice.
> > >
> >
>
>
> --
> Isaac Raway
> Entia non sunt multiplicanda praeter necessitatem.
>
> http://blueapples.org - blog
> http://stonenotes.com - personal knowledge management
>
>
>





--
Isaac Raway
Entia non sunt multiplicanda praeter necessitatem.

http://blueapples.org - blog
http://stonenotes.com - personal knowledge management




Re: [sqlite] serious performance problems with indexes

2006-10-28 Thread Joe Wilson
If you pre-sort the data prior to inserting into sqlite you will
see much better timings for both the default cache size and
the larger cache size.

When I run your script (unmodified, except for the removal of 
MySQL) up to 40:

tablesize   sqlite  sqlite-largecache
0   1.956   1.956
2   2.064   2.079
4   2.095   2.111
6   2.173   2.173
8   2.189   2.283
10  2.345   2.298
12  2.345   2.360
14  2.407   2.361
16  2.470   2.423
18  2.548   2.501
20  2.564   2.439
22  2.830   2.439
24  2.876   2.907
26  2.829   2.564
28  3.423   3.533
30  4.939   3.564
32  7.236   3.736
34  7.283   3.751
36  10.611  3.767
38  11.142  3.845
40  13.736  3.798

When I make the following change to your script to simulate an ordered data set
(okay, it's cheating slightly - no pre-sort need be performed here):

   # set r [expr {int(rand()*50)}]    ;# original: random keys
   set r [expr {($num+$i)*5}]         ;# ordered: strictly increasing keys

tablesize   sqlite  sqlite-largecache
0   1.878   1.894
2   1.925   1.923
4   1.923   1.923
6   1.923   1.954
8   1.970   1.923
10  1.923   1.908
12  1.923   1.970
14  1.940   2.383
16  2.048   1.908
18  1.923   2.002
20  1.923   1.923
22  1.939   1.923
24  1.923   1.923
26  1.939   1.923
28  1.939   1.939
30  1.954   1.939
32  1.939   1.923
34  1.970   1.954
36  1.939   1.954
38  1.923   1.939
40  1.970   1.939

I wonder if other databases pre-sort their batch inserts by
index order to achieve more consistent insert speed with
datasets that exceed the cache size.
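
By pre-sorting I mean something like this sketch in C (table and key
names invented): sort the batch on the indexed key before inserting, so
the index's B-tree pages are touched in increasing order.

/* Sketch: pre-sort a batch on the indexed key before inserting. */
#include <sqlite3.h>
#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

static void insert_batch(sqlite3 *db, int *keys, int n)
{
    qsort(keys, n, sizeof keys[0], cmp_int);   /* the pre-sort */
    sqlite3_stmt *stmt;
    sqlite3_exec(db, "BEGIN;", NULL, NULL, NULL);
    sqlite3_prepare(db, "INSERT INTO t(key) VALUES (?);", -1, &stmt, NULL);
    for (int i = 0; i < n; i++) {
        sqlite3_bind_int(stmt, 1, keys[i]);
        sqlite3_step(stmt);
        sqlite3_reset(stmt);
    }
    sqlite3_finalize(stmt);
    sqlite3_exec(db, "COMMIT;", NULL, NULL, NULL);
}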


- Original Message 
From: Peter De Rijk <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Saturday, October 28, 2006 10:40:03 AM
Subject: Re: [sqlite] serious performance problems with indexes

I have checked at a finer grain (results and code at the end of the
mail), and it seems indeed that the extreme growth of the time needed
for inserts only starts at around 16 rows (on my system). Before that
it also rises, but much more slowly. As this indeed looked like a cache
effect, I tried setting the default_cache_size to something big (2).
This was enough to remove the bend (at least till 200 rows), so that
the increase in time kept going at the same slower pace as for the
smaller data sizes.

Curiously, mysql, which I included as a reference, manages to keep
nearly the same insert time even for the largest data sets. Using the
large cache size, sqlite is faster than mysql until about 70 rows. At
200 rows mysql is about 3 times faster than sqlite with the large cache
(vs more than 20 times faster than sqlite with the default cache size).

--- 
tablesize   sqlite  sqlite-largecache   mysql
0           1.12    1.04                2.86
2           1.09    1.17                2.89
4           1.14    1.15                3.27
6           1.23    1.2                 3.18
8           1.47    1.25                3.01
10          1.3     1.32                2.94
12          1.4     1.34                2.94
14          2.06    1.38                3
16          2.41    1.46                2.99
18          3.09    1.54                3.03
20          3.31    1.62                3.04
22          3.84    1.72                3.06
24          4.22    1.7                 3.03
26          4.23    1.76                3.06
28          4.83    1.94                2.98
30          6.27    2.08                3.07
32          9.02    2.06                3.11
34          9.48    1.94                2.88
36          10      2.13                3.04
38          10.6    2.21                3.13
40          10.72   2.35                3.05
42          11.35   2.24                3.11
44          11.55   2.49                3.05
46          11.75   2.5                 3.07
48          16.89   2.39                3.1
50          17.81   2.59                3.09
52          19.22   2.4                 3.03
54          19.35   2.77                3.05
56          19.44   2.59                3.1
58          19.87   2.67                3.08
60          21.47   2.88                3.09
62          20.82   2.74                3.08
64          21.55   3                   3.11
66          25.17   3.09                3.12
68          29.89   2.84                3.16
70          28.48   3.25                3.09
72          28.62   3.31                3.2
74          28.69   3.23                3.07
76          30.34   3.59                3.1
78          30.27   3.64                3.13
80          30.57   3.37                3.14
82          31.41   3.57                3.06
84          31.89   3.89                3.09
86          33.77   3.57                3.05
88          36.01   3.87                3.1
90          37.58   4.04                3.07
92          40.59   3.68                3.04
94          40.9    3.72                3.09
96          41.45   4.24                3.1
98          42.05   4.28                3.13
100         41.11   4.09                3.03
102         42.29   4.63                3.11
104         42.09   4.81                3.17
106         42.8    4.34                3.14
108         43.82   4.43                3.02
110         43.01   4.5                 3.12
112         44.08   4.84                3.1
114         45.4    5.08                3.16
116         47.51   4.66                3.13
118         47.46   5.02                3.17
120         48.21   5.42                3.17
122         49.09   5.33                3.21
124         49.65   5.12                3.1
126         50.55   5.3                 3.11
128         50.56   5.8                 3.17
130         52.35   5.7                 3.17
132         53.67   5.43                3.14
134         52.38   5.79                3.26
136         51.44   6.03                3.18
138         52.69   6.16                3.08

Re: [sqlite] serious performance problems with indexes

2006-10-28 Thread Peter De Rijk
--On Friday 27 October 2006 15:31, [EMAIL PROTECTED] wrote:
> When a table is indexed, INSERT performance is logarithmic in the
> number of rows in the table and linear in the number of indices.
> This is because entries have to be inserted into the index in
> sorted order (otherwise it wouldn't be an index).  And each
> insert thus requires a binary search.
>
> If your index becomes very large so that it no longer fits
> in your disk cache, then the binary search can involve a lot
> of disk I/O which can make things really slow.  The usual
> work-around here is to keep a much smaller staging table into
> which you do your inserts and then periodically merge the
> staging table into the main table.  This makes your queries
> more complex (and slower) since you are now having to
> search multiple tables.  But it does speed up inserts.
> --
> D. Richard Hipp  <[EMAIL PROTECTED]>
>
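
(A minimal sketch of that staging-table idea, with invented names;
queries would then consult both tables, e.g. via UNION ALL:)

/* Sketch: inserts go to a small staging table whose index stays in
 * cache; periodically merge it into the big table.  Invented names. */
#include <sqlite3.h>

static void merge_staging(sqlite3 *db)
{
    sqlite3_exec(db,
        "BEGIN;"
        "INSERT INTO big(key, val) SELECT key, val FROM staging;"
        "DELETE FROM staging;"
        "COMMIT;",
        NULL, NULL, NULL);
}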

I have checked at a finer grain (results and code at the end of the
mail), and it seems indeed that the extreme growth of the time needed
for inserts only starts at around 16 rows (on my system). Before that
it also rises, but much more slowly. As this indeed looked like a cache
effect, I tried setting the default_cache_size to something big (2).
This was enough to remove the bend (at least till 200 rows), so that
the increase in time kept going at the same slower pace as for the
smaller data sizes.

Curiously, mysql, which I included as a reference, manages to keep
nearly the same insert time even for the largest data sets. Using the
large cache size, sqlite is faster than mysql until about 70 rows. At
200 rows mysql is about 3 times faster than sqlite with the large cache
(vs more than 20 times faster than sqlite with the default cache size).
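
(Enlarging the cache looks like this; a sketch, and the page count
shown is an arbitrary illustrative value:)

/* Sketch: enlarge the page cache before a big batch of inserts. */
#include <sqlite3.h>

static void enlarge_cache(sqlite3 *db)
{
    /* Per connection, until this handle is closed: */
    sqlite3_exec(db, "PRAGMA cache_size = 20000;", NULL, NULL, NULL);
    /* Or persistently, stored in the database file itself: */
    sqlite3_exec(db, "PRAGMA default_cache_size = 20000;", NULL, NULL, NULL);
}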

--- 
tablesize   sqlite  sqlite-largecache   mysql
0           1.12    1.04                2.86
2           1.09    1.17                2.89
4           1.14    1.15                3.27
6           1.23    1.2                 3.18
8           1.47    1.25                3.01
10          1.3     1.32                2.94
12          1.4     1.34                2.94
14          2.06    1.38                3
16          2.41    1.46                2.99
18          3.09    1.54                3.03
20          3.31    1.62                3.04
22          3.84    1.72                3.06
24          4.22    1.7                 3.03
26          4.23    1.76                3.06
28          4.83    1.94                2.98
30          6.27    2.08                3.07
32          9.02    2.06                3.11
34          9.48    1.94                2.88
36          10      2.13                3.04
38          10.6    2.21                3.13
40          10.72   2.35                3.05
42          11.35   2.24                3.11
44          11.55   2.49                3.05
46          11.75   2.5                 3.07
48          16.89   2.39                3.1
50          17.81   2.59                3.09
52          19.22   2.4                 3.03
54          19.35   2.77                3.05
56          19.44   2.59                3.1
58          19.87   2.67                3.08
60          21.47   2.88                3.09
62          20.82   2.74                3.08
64          21.55   3                   3.11
66          25.17   3.09                3.12
68          29.89   2.84                3.16
70          28.48   3.25                3.09
72          28.62   3.31                3.2
74          28.69   3.23                3.07
76          30.34   3.59                3.1
78          30.27   3.64                3.13
80          30.57   3.37                3.14
82          31.41   3.57                3.06
84          31.89   3.89                3.09
86          33.77   3.57                3.05
88          36.01   3.87                3.1
90          37.58   4.04                3.07
92          40.59   3.68                3.04
94          40.9    3.72                3.09
96          41.45   4.24                3.1
98          42.05   4.28                3.13
100         41.11   4.09                3.03
102         42.29   4.63                3.11
104         42.09   4.81                3.17
106         42.8    4.34                3.14
108         43.82   4.43                3.02
110         43.01   4.5                 3.12
112         44.08   4.84                3.1
114         45.4    5.08                3.16
116         47.51   4.66                3.13
118         47.46   5.02                3.17
120         48.21   5.42                3.17
122         49.09   5.33                3.21
124         49.65   5.12                3.1
126         50.55   5.3                 3.11
128         50.56   5.8                 3.17
130         52.35   5.7                 3.17
132         53.67   5.43                3.14
134         52.38   5.79                3.26
136         51.44   6.03                3.18
138         52.69   6.16                3.08
140         53.88   5.67                3.04
142         52.61   5.86                3.2
144         53.61   5.92                3.01
146         52.98   5.98                3.08
148         54.74   6.27                3.18
150         55.4    6.81                3.18
152         58.45   6.54                3.12
154         57.14   6.65                3.12
156         58.92   6.97                3.17
158         58.17   6.86                3.22
160         58.29   6.83                3.15
162         58.19   6.6                 3.18
164         58.77   7.3                 3.16
166         60.59   7.44                3.22
168         61.12   7.14                3.18
170         61.25   7.41                3.33
172         62.2    7.18                3.21
174         61.48   7.95                3.02
176         62.63   7.91                3.17
178         62.58   7.52                3.19
180         62.28   7.56                3.14
182         64.2    7.62                3.33
184         64.18   8.6                 3.17
186         64.34   8.71                3.33
188         63.93   8.44                3.22
190         62.98   8.13                3.3
192         63.9    8.45                3.29
194         64.85   8.36                3.21
196         68.26   10.75               4.81
198         67.81   9.46                3.17

-- code --
proc runtest {sqlfile} {
  set result {}
  set delay 1000
  exec sync; after $delay;   ;# flush pending writes, let the system settle
  # Run the SQL file through the sqlite3 shell and time it (one iteration).
  set t [time "exec sqlite3 def.db <$sqlfile" 1]
  # [time] reports microseconds; convert to seconds.
  set t [expr {[lindex $t 0]/1000000.0}]
  lappend result [format %.3f $t]
  exec sync; after $delay;

Re: [sqlite] Regarding sqlite3_open

2006-10-28 Thread Trevor Talbot

On 10/28/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:


I am doing it on Windows. Hence permissions shouldn't be a problem.


NTFS has permissions too.  Sysinternals' FileMon utility may be
useful, as it will show exactly what is going on in terms of file I/O.




RE: [sqlite] Regarding sqlite3_open

2006-10-28 Thread ravi.karatagi

I am doing it on Windows. Hence permissions shouldn't be a problem.

Regards,
Ravi K

-Original Message-
From: Lloyd [mailto:[EMAIL PROTECTED]
Sent: Saturday, October 28, 2006 4:49 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Regarding sqlite3_open

Which platform are you using? Does your application have enough
permission to create files in the specified location?

On Sat, 2006-10-28 at 16:19 +0530, [EMAIL PROTECTED] wrote:
> Hi All,
>
> I am using sqlite3_open() in my program. It is returning
> error code 14 (file not found) every time.
>
> Where does it search for the file? I even tried with the fully
> qualified path, also placing the file in the working directory. Even
> if the file does not exist, it should create the file and return
> SQLITE_OK.
> Please throw some light on it.
>
>
>
> Thanks,
>
> Regards,
>
> Ravi K
>





[sqlite] Compatibility sqlite 2.8 and 3.x

2006-10-28 Thread victor . camus
Hi all,

Is it possible to make requests on a database created with 2.8.(14 or 17)
with a (Tcl) API developed for SQLite 3.0?

I am still trying to find a way to create/have access from OpenWatcom
Fortran. Since I noted that version 2.8.14 of sqlite can be compiled with
OpenWatcom (is that also true of 2.8.17, available on the sqlite
homepage?), the idea would be to create a sqlite database in 2.8 and to
access it in Fortran with a sqlite library in 2.8 and a Tcl API in 3.0.

Does it make sense?

Victor




Re: [sqlite] Regarding sqlite3_open

2006-10-28 Thread Lloyd
Which platform are you using? Does your application have enough
permission to create files in the specified location?

On Sat, 2006-10-28 at 16:19 +0530, [EMAIL PROTECTED] wrote:
> Hi All,
> 
> I am using sqlite3_open() in my program. It is returning
> error code 14 (file not found) every time.
> 
> Where does it search for the file? I even tried with the fully
> qualified path, also placing the file in the working directory. Even
> if the file does not exist, it should create the file and return
> SQLITE_OK.
> 
> Please throw some light on it.
> 
> 
> 
> Thanks,
> 
> Regards,
> 
> Ravi K



[sqlite] Regarding sqlite3_open

2006-10-28 Thread ravi.karatagi

Hi All,

I am using sqlite3_open() in my program. It is returning
error code 14 (file not found) every time.

Where does it search for the file? I even tried with the fully qualified
path, also placing the file in the working directory. Even if the file
does not exist, it should create the file and return SQLITE_OK.

Please throw some light on it.
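
For reference, a minimal sketch of the call I am making (the path is
invented); result code 14 is SQLITE_CANTOPEN, "unable to open database
file":

/* Sketch: open/create a database and print SQLite's own explanation
 * on failure.  Error code 14 == SQLITE_CANTOPEN.  Path is invented. */
#include <sqlite3.h>
#include <stdio.h>

int main(void)
{
    sqlite3 *db = NULL;
    int rc = sqlite3_open("C:\\data\\test.db", &db);
    if (rc != SQLITE_OK) {
        fprintf(stderr, "rc=%d: %s\n", rc, sqlite3_errmsg(db));
        sqlite3_close(db);
        return 1;
    }
    puts("opened OK");
    sqlite3_close(db);
    return 0;
}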



Thanks,

Regards,

Ravi K







Re: [sqlite] Re: Regarding sqlite3_exec

2006-10-28 Thread Da Martian

Hi

Thanks for the response. The main reason is that my record count could be
anywhere from a few thousand to a million. But even at lowly record counts
the interface can seem sluggish if you read every record before displaying
anything.

As you mention, and as has been discussed above, doing stuff in the
background is a good way to go, but more complex. As a general rule of
coding I put as few unnecessary threads into a "phase 1" program as I can,
because the complexity goes up hugely, threads can be complex to use,
coordinate, interrupt etc., and the chance of bugs goes up dramatically.
So I tend to leave that for a "Phase 2" bells-and-whistles phase, and only
when there isn't a simpler way.

This thread has covered just about all the approaches I can think of :-)
thanks for the responses.

S


On 10/27/06, Isaac Raway <[EMAIL PROTECTED]> wrote:


Why don't you design the table with a unique row ID, stored in an
integer field, then fetch a list of those ID numbers?

For 5000 rows, assuming you store them in your application as 4-byte
longs, that's about 19 KB of memory.

Counting that result as you receive it isn't that difficult. If it
takes a long time (it probably won't) you can do it in another thread
and update the interface as appropriate.

I'm not seeing a downside here.

Isaac

On 10/26/06, Da Martian <[EMAIL PROTECTED]> wrote:
> No there isn't, but RDBM systems are a generalised data retrieval
> mechanism. As such they suffer from that generality.
>
> Don't get me wrong, RDBM systems are appropriate for 95% of all data
> requirements I have had to deal with and I would never dream of trying
> to write one from scratch, nor can I imagine a world without them.
>
> However certain applications (weather data, genome data, large indices
> (like Google's)) require something designed specifically for that
> purpose. If you customise data retrieval (and in particular your
> sorting/indices/access path) you can leave RDBMSs in the dust in terms
> of performance. All I have read about Google suggests they do exactly
> this. Although I must point out, I don't actually know anything about
> Google with any certainty, just what has "leaked" out over the years
> on the rumour mill. But in designing my own "Google"-like indices (on
> a smaller scale of course) and some specialised weather stuff, it was
> necessary to throw away the RDBMS and do it yourself. For a Google
> query, for instance, they know they will get a list of 1 or more
> words. They also know they will only ever search through the index of
> words. They don't have other data types, records or tables. Why go
> through all the hassle of compiling SQL, and that generic overhead,
> when your application will only ever do one thing? You can just make
> an API like this: "search(wordlist): Resultset".
>
> You immediately save yourself complexity and processing time. Then for
> large indices you will know your data set, so instead of using a
> standard BTree you would use a more appropriate data structure,
> possibly with skip lists etc.
>
> As for performing a database search twice, this whole thread has shown
> that sometimes you have to :-)
>
> S
>
> On 10/25/06, John Stanton <[EMAIL PROTECTED]> wrote:
> >
> > There is no magic in data retrieval.  Google use the same physical
> > laws as us ordinary mortals.
> >
> > I see no reason to ever perform a database search twice.
> >


--
Isaac Raway
Entia non sunt multiplicanda praeter necessitatem.

http://blueapples.org - blog
http://stonenotes.com - personal knowledge management

