[sqlite] Using SQLite in threaded Web server applications (was: Using SQLite to record Web log file data (a threading question))
Those suggestions are great. For something like log analysis, there's no problem either reading in log file tails every so often or queuing up a single thread to give more real-time access. I think I'm leaning towards a more real-time view, but I might take the tail option if it's easier to code.

On the topic of threads, though, this does open up a pile of other "running behind a Web server" sorts of operations:

* To keep user authentication data in an SQLite db, users will need some level of real-time response to logins
* To keep a db of, say, a discussion forum, user posts will need to be written and made available in realish-time
* To update Web pages dynamically with information populated from a user's preferences or users' usage patterns, the page will need to be generated in real time

And each of these interactions with the server will occur in different user sessions in different threads. Can SQLite be used to build applications that do the operations above? Do I need a single-thread queue structure for each approach? I'd love some guiding thoughts on these things to help me understand how to really approach the problem. Thanks!

-- David

- To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] serious performance problems with indexes
The last test, which simulated pre-sorting 2 million rows in index order prior to insert, may show optimal insert speed since it is only appending pages to the table and the index, but it may not be realistic, since you may not have sufficient memory to pre-sort all the data in memory prior to insert. (Although if you have the memory to spare, this is the way to go.)

Here's another test on the same machine that uses random inputs but inserts batches of 2 pre-sorted rows at a time. You can see that insert time increases with the size of the table, but it is not exponential as was the case with the original test program. You can also see that having a large sqlite cache does not significantly affect the insert timings under 2M rows as compared to the default sqlite cache. The Tcl code change follows the data.

tablesize  sqlite  sqlite-largecache
  0  1.909  1.909
  2  2.019  2.048
  4  2.064  2.079
  6  2.111  2.079
  8  2.142  2.158
 10  2.157  2.157
 12  2.282  2.235
 14  2.298  2.267
 16  2.392  2.298
 18  2.439  2.314
 20  2.361  2.345
 22  2.501  2.439
 24  2.689  3.439
 26  2.704  3.423
 28  2.673  3.564
 30  2.767  3.533
 32  2.829  3.564
 34  2.876  2.657
 36  2.955  2.626
 38  3.017  3.673
 40  3.001  3.704
 42  3.048  2.767
 44  2.970  3.783
 46  3.048  2.845
 48  3.095  3.861
 50  3.157  3.861
 52  3.173  3.923
 54  3.142  3.970
 56  3.236  3.986
 58  3.220  4.002
 60  3.408  4.970
 62  3.267  4.142
 64  3.423  5.017
 66  3.454  4.189
 68  3.611  4.220
 70  3.611  5.142
 72  3.704  5.251
 74  3.829  4.330
 76  3.657  4.330
 78  3.970  4.455
 80  4.001  4.470
 82  4.046  4.454
 84  4.173  4.470
 86  4.142  4.611
 88  4.126  4.548
 90  4.298  4.501
 92  4.517  4.642
 94  4.641  4.689
 96  4.517  5.579
 98  4.673  4.736
100  4.736  4.751
102  4.689  4.845
104  4.923  4.798
106  5.111  4.939
108  5.204  4.314
110  5.048  4.876
112  5.064  5.814
114  5.205  5.876
116  5.314  5.876
118  5.501  5.142
120  6.517  5.032
122  5.736  5.048
124  5.861  6.986
126  5.892  5.986
128  5.767  5.298
130  6.095  6.095
132  6.080  5.329
134  5.954  5.283
136  6.548  5.986
138  6.157  5.423
140  6.642  6.330
142  6.439  5.643
144  6.705  5.579
146  6.626  5.611
148  6.814  6.517
150  6.720  6.533
152  7.111  6.579
154  7.142  5.845
156  7.533  6.173
158  7.345  6.064
160  7.689  6.142
162  7.798  6.283
164  7.876  6.486
166  7.704  6.408
168  8.283  6.470
170  8.455  6.517
172  8.126  8.049
174  8.314  7.720
176  8.564  7.017
178  8.845  7.095
180  8.814  7.079
182  9.126  7.002
184  8.798  7.174
186  9.189  7.236
188  9.111  7.184
190  9.986  11.830
192  9.658  7.673
194  9.876  9.126
196  9.892  7.908
198  9.626  7.580

set o [open results.csv w]
puts $o "tablesize\tsqlite\tsqlite-largecache\tmysql"
set step 2
for {set num 0} {$num < 200} {incr num $step} {
    puts $num
    set fd [open test.sql w]
    set rows [list]
    for {set i 1} {$i<=$step} {incr i} {
        lappend rows [list [expr {$num+$i}] [expr {int(rand()*50)}]]
    }
    puts $fd "BEGIN;"
    foreach i [lsort -integer -index 1 $rows] {
        puts $fd "INSERT INTO t1 VALUES([lindex $i 0], [lindex $i 1]);"
    }
    puts $fd "COMMIT;"
    close $fd
    puts $o $num\t[join [runtest test.sql] \t]
    flush $o
}
close $o

----- Original Message -----
From: Joe Wilson <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Saturday, October 28, 2006 1:08:20 PM
Subject: Re: [sqlite] serious performance problems with indexes

If you pre-sort the data prior to inserting into sqlite, you will see much better timings for both the default cache size and the larger cache size.
When I run your script (unmodified, except for the removal of MySQL) up to 40:

tablesize  sqlite  sqlite-largecache
  0   1.956  1.956
  2   2.064  2.079
  4   2.095  2.111
  6   2.173  2.173
  8   2.189  2.283
 10   2.345  2.298
 12   2.345  2.360
 14   2.407  2.361
 16   2.470  2.423
 18   2.548  2.501
 20   2.564  2.439
 22   2.830  2.439
 24   2.876  2.907
 26   2.829  2.564
 28   3.423  3.533
 30   4.939  3.564
 32   7.236  3.736
 34   7.283  3.751
 36  10.611  3.767
 38  11.142  3.845
 40  13.736  3.798

When I make the following change to your script to simulate an ordered data set (okay, it's cheating slightly, since no pre-sort need be performed here):

    # set r [expr {int(rand()*50)}]
    set r [expr {($num+$i)*5}]

tablesize  sqlite  sqlite-largecache
  0  1.878  1.894
  2  1.925  1.923
  4  1.923  1.923
  6  1.923  1.954
  8  1.970  1.923
 10  1.923  1.908
 12  1.923
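To make the technique concrete, here is a minimal Python sketch (not part of the original thread; the table name matches the Tcl script, but the row count and batch size are arbitrary) of the same idea: inserting into an indexed table in batches, with each batch pre-sorted on the indexed column so index pages are visited in order rather than at random:

```python
import random
import sqlite3

# In-memory database analogous to the thread's test table:
# an integer primary key plus an indexed second column.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t1(a INTEGER PRIMARY KEY, b INTEGER)")
db.execute("CREATE INDEX i1 ON t1(b)")

BATCH = 1000
for batch_start in range(0, 10_000, BATCH):
    rows = [(batch_start + i, random.randrange(50)) for i in range(1, BATCH + 1)]
    # Pre-sort each batch on the indexed column, mirroring the
    # "lsort -integer -index 1" step in the Tcl script.
    rows.sort(key=lambda r: r[1])
    with db:  # one transaction per batch
        db.executemany("INSERT INTO t1 VALUES(?, ?)", rows)

print(db.execute("SELECT count(*) FROM t1").fetchone()[0])  # prints 10000
```

The larger the batch, the closer the index inserts get to the fully pre-sorted (append-only) case, at the cost of buffering more rows in memory before each commit.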
Re: [sqlite] Using SQLite to record Web log file data (a threading question)
[EMAIL PROTECTED] wrote:

"David Gewirtz" <[EMAIL PROTECTED]> wrote:

I've been exploring SQLite for a number of applications, but one I'd like to embark on soon is using SQLite to record Web site log file data, so I can perform SQL-based analysis on my logs, rather than using something like Analog. Unfortunately, each Web access on the server is likely to be in its own thread. The scenario I'd probably have to build might go something like this:

* Web server launches
* SQLite issues sqlite3_open to log db, gets log_ID
* Web server handles user 1 in thread 1, which writes to log db
* Web server handles user 2 in thread 2, which writes to log db
* Web server handles user n in thread n, which writes to log db
* Web server handles admin request for log analysis, which reads from log db
* Web server begins shutdown, closes log db
* Web server shuts down

From my reading, it's just not clear to me whether this is bad behavior for SQLite 3.3.7. Can SQLite handle this sort of sequence reliably? If not, any suggestions about how I might proceed or how I should think about it?

The way I handle this at www.sqlite.org is that web log data just gets appended to an ordinary text file. Then when I want to do analysis, I make a copy of the text file and import it into an SQLite database. I then do queries against the SQLite database to extract the information I want to know.

You could perhaps automate this so that a background task took the unprocessed tail of your log file and added it to an SQLite database every 10 minutes or so. Or every 30 seconds. Just rig it so that you only have one process trying to write at a time, and so that you do not have to take transaction overhead for every single web hit. SQLite itself is not the best tool for doing multiple concurrent writes from different threads or processes.

-- D. Richard Hipp <[EMAIL PROTECTED]>

I made SQLite accept log data and analysis from a web server.
From a multi-threaded web server, I used a separate thread to handle the DB insertion. It is driven from a queue of log transactions posted by the active threads. That way it is single-streamed and has no contention problems, and the active threads have a very fast mechanism to post their log data: they just queue a pointer to the buffer used by the thread. Each time a thread activates, it gets a freed buffer from a pool. When the log thread finishes a transaction, it drops the buffer onto the free list.

The advantage of this approach is that it gives real-time log analysis. To that end, some frequently accessed summaries are maintained. Just maintaining the log in a simple form in an SQLite database offers little, if any, advantage over keeping a text file and analysing it from time to time.
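The buffer-pool detail above is specific to a C implementation, but the single-writer-thread pattern itself can be sketched in a few lines of Python (a hypothetical sketch, not the poster's code; the database filename and schema are made up for illustration):

```python
import queue
import sqlite3
import threading

log_q = queue.Queue()  # thread-safe queue of log entries (or None to stop)

def writer():
    # Dedicated writer thread: the only code that touches the database,
    # so request-handling threads never contend for the write lock.
    db = sqlite3.connect("weblog.db")  # hypothetical filename
    db.execute("CREATE TABLE IF NOT EXISTS log(ts TEXT, path TEXT)")
    while True:
        item = log_q.get()
        if item is None:      # shutdown sentinel
            break
        with db:              # one transaction per entry; a real version
            db.execute("INSERT INTO log VALUES(?, ?)", item)  # might batch
    db.close()

t = threading.Thread(target=writer)
t.start()

# Request-handling threads just enqueue and move on.
for i in range(5):
    log_q.put(("2006-10-29T12:00:0%d" % i, "/index.html"))

log_q.put(None)  # tell the writer to finish
t.join()
```

A production version would drain the queue in batches so that many hits share one commit, which addresses the per-hit transaction overhead mentioned earlier in the thread.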
Re: [sqlite] Using SQLite to record Web log file data (a threading question)
On 10/29/06, David Gewirtz <[EMAIL PROTECTED]> wrote:

I've been exploring SQLite for a number of applications, but one I'd like to embark on soon is using SQLite to record Web site log file data, so I can perform SQL-based analysis on my logs, rather than using something like Analog. Unfortunately, each Web access on the server is likely to be in its own thread. The scenario I'd probably have to build might go something like this:

* Web server launches
* SQLite issues sqlite3_open to log db, gets log_ID
* Web server handles user 1 in thread 1, which writes to log db
* Web server handles user 2 in thread 2, which writes to log db
* Web server handles user n in thread n, which writes to log db
* Web server handles admin request for log analysis, which reads from log db
* Web server begins shutdown, closes log db
* Web server shuts down

From my reading, it's just not clear to me whether this is bad behavior for SQLite 3.3.7. Can SQLite handle this sort of sequence reliably? If not, any suggestions about how I might proceed or how I should think about it?

If you really need real-time insight into your logs, then see http://www.sqlite.org/sharedcache.html and http://www.sqlite.org/cvstrac/fileview?f=sqlite/src/server.c=1.6 for an example implementation and some more docs on the subject.

Though if you can live with storing logs in text files and reading them into an SQLite db periodically and/or on demand (if that will happen rarely enough to make more sense than the above scenario), it will probably be much simpler to implement.

-- Nemanja Corlija <[EMAIL PROTECTED]>
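The periodic text-file import that several replies recommend could look something like this in Python (a sketch under assumed file names and a deliberately trivial one-column schema; a real version would parse the actual log format into proper columns):

```python
import sqlite3

# Hypothetical file names for illustration.
LOG_FILE = "access.log"
DB_FILE = "loganalysis.db"
OFFSET_FILE = "access.log.offset"

def import_tail():
    """Import only the log lines appended since the last run."""
    # Remember how far we got last time; start from 0 on the first run.
    try:
        offset = int(open(OFFSET_FILE).read())
    except (FileNotFoundError, ValueError):
        offset = 0
    with open(LOG_FILE) as f:
        f.seek(offset)
        lines = f.readlines()
        new_offset = f.tell()
    db = sqlite3.connect(DB_FILE)
    db.execute("CREATE TABLE IF NOT EXISTS log(line TEXT)")
    with db:  # one transaction for the whole batch of new lines
        db.executemany("INSERT INTO log VALUES(?)",
                       [(ln.rstrip("\n"),) for ln in lines])
    db.close()
    with open(OFFSET_FILE, "w") as f:
        f.write(str(new_offset))
```

Run from cron or a background task every few minutes, this keeps a single writer and pays one transaction per batch rather than one per web hit, which is the arrangement the thread converges on.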