On Sat, Aug 29, 2015 at 4:16 AM, Yahoo! Mail <stefanossofroniou542 at yahoo.com> wrote:
> On 08/28/2015 09:36 PM, Scott Robison wrote:
>> On Fri, Aug 28, 2015 at 3:47 AM, Yahoo! Mail
>> <stefanossofroniou542 at yahoo.com> wrote:
>>> On 08/26/2015 09:03 PM, Richard Hipp wrote:
>>>> Time stands still for multiple rows, as long as they are within the
>>>> same sqlite3_step() call.  For example, if you run:
>>>>
>>>>     CREATE TABLE t1(a DATETIME);
>>>>     WITH RECURSIVE
>>>>       c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x<1000000000)
>>>>     INSERT INTO t1(a) SELECT datetime('now') FROM c;
>>>>
>>>> The entire INSERT statement will be a single sqlite3_step() call, and
>>>> so all billion rows of t1 will get set to the same time.
>>>
>>> I have tested this code with version 3.8.10.2 using "DB Browser for
>>> SQLite" and it would crash; the same with SQLite Manager, which uses
>>> the same version. I decided to use the latest SQLite version available
>>> from GNU/Linux Debian testing 64-bit; it's 3.8.11.1, and something
>>> strange is happening. I hope I'm the reason behind its rather peculiar
>>> behavior.
>>>
>>> I tested the following code:
>>>
>>>     sqlite> drop table if exists t1; create table t1(a datetime); begin;
>>>     with recursive c(x) as (values(1) union all select x + 1 from c
>>>     where x < 1000000000) insert into t1(a) select datetime('now')
>>>     from c; commit;
>>>
>>> It would take ages to finish, and that is logical; it has a billion
>>> rows to insert, even though I used begin - commit.
>>>
>>> Now the rather strange behavior: when I use the up arrow key to reuse
>>> the previous code, replacing 1000000000 with 10, it takes the same
>>> time to finish as before, and that is not right; I'm dropping the
>>> table, and it should take nearly 0 milliseconds for 10 rows.
>>>
>>> Can someone confirm this?
>>
>> I have been using 3.8.11, so I just downloaded 3.8.11.1 to test your
>> exact query, cut & pasted from above.
>> Here is the output from my session:
>>
>>     X:\>sqlite3
>>     SQLite version 3.8.11.1 2015-07-29 20:00:57
>>     Enter ".help" for usage hints.
>>     Connected to a transient in-memory database.
>>     Use ".open FILENAME" to reopen on a persistent database.
>>     sqlite> .timer on
>>     sqlite> drop table if exists t1; create table t1(a datetime); begin;
>>     with recursive c(x) as (values(1) union all select x + 1 from c
>>     where x < 10000000) insert into t1(a) select datetime('now') from c;
>>     commit;
>>     Run Time: real 16.751 user 16.520506 sys 0.062400
>>     sqlite> drop table if exists t1; create table t1(a datetime); begin;
>>     with recursive c(x) as (values(1) union all select x + 1 from c
>>     where x < 10) insert into t1(a) select datetime('now') from c;
>>     commit;
>>     Run Time: real 0.387 user 0.374402 sys 0.000000
>>     sqlite>
>>
>> Note: I pasted the first query that begins drop table, then used up
>> arrow and edited the 10000000 to 10 for the second query. Further, I
>> didn't have time to try a full billion, so I settled for 10 million. I
>> also used a transient in-memory database. Finally, I am using Windows 7
>> Professional on a box with 16 GiB of RAM.
>
> I tested it with a local database and not with an in-memory one. I
> wanted someone to test it so we could see if it's something that has to
> do with a reasonable disk access restriction after a certain number of
> multiple attempts.

I misunderstood. I inferred from your original message that you thought
scrolling back through the history and editing the query resulted in the
original query still being executed.

SQLite version 3.8.11.1 2015-07-29 20:00:57
Enter ".help" for usage hints.
sqlite> .timer on
sqlite> drop table if exists t1; create table t1(a datetime); begin; with
   ...> recursive c(x) as (values(1) union all select x + 1 from c where x <
   ...> 1000000000) insert into t1(a) select datetime('now') from c; commit;
Run Time: real 3087.125 user 2534.593750 sys 329.078125
sqlite> drop table if exists t1; create table t1(a datetime); begin; with
   ...> recursive c(x) as (values(1) union all select x + 1 from c where x <
   ...> 10) insert into t1(a) select datetime('now') from c; commit;
Run Time: real 764.309 user 84.031250 sys 132.812500

Watching performance, most of the time spent on the second query seemed
to be the drop table command, and it was primarily disk IO. Note that
user + sys is only about 216 seconds out of 764; that leaves a lot of
time for disk IO. So I repeated the first query and ran only the drop
table part of the second:

sqlite> drop table if exists t1; create table t1(a datetime); begin; with
   ...> recursive c(x) as (values(1) union all select x + 1 from c where x <
   ...> 1000000000) insert into t1(a) select datetime('now') from c; commit;
Run Time: real 4203.249 user 2536.750000 sys 320.281250
sqlite> drop table if exists t1;
Run Time: real 4322.951 user 146.187500 sys 698.984375

Note that inserting one billion records took about 1200 seconds longer
the second time (where I was reusing the same database file). I would
imagine this is because it is faster to extend the file by a page (one
seek and one write) than it is to update the freelist to make a page
available (seek, read, seek, write). 1200 seconds over the course of
28,000,000 pages isn't much on a per-page basis, but it adds up. More
interesting, the user and sys times are statistically identical between
the first run of one billion rows and the second. That tells us the
difference was IO driven (or that utilization of my computer went way
up, or a combination of the two).
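The extend-vs-freelist effect above can be observed at a much smaller
scale from Python's stdlib sqlite3 module. This is only a sketch of the
idea, scaled down to 100,000 rows (the file name and row count are my
own choices, not from the thread); it uses the standard freelist_count
pragma to show that a dropped table's pages land on the freelist, and it
also confirms Richard's point that datetime('now') is frozen for the
duration of a single statement:

```python
import os
import sqlite3
import tempfile

# A fresh on-disk database (not :memory:), so the freelist is real.
path = os.path.join(tempfile.mkdtemp(), "freelist_demo.db")
conn = sqlite3.connect(path)

def freelist_count(conn):
    """Number of pages currently on the database freelist."""
    return conn.execute("PRAGMA freelist_count").fetchone()[0]

conn.execute("CREATE TABLE t1(a DATETIME)")
conn.execute("""
    WITH RECURSIVE
      c(x) AS (VALUES(1) UNION ALL SELECT x + 1 FROM c WHERE x < 100000)
    INSERT INTO t1(a) SELECT datetime('now') FROM c
""")
conn.commit()

# All rows came from a single statement, so datetime('now') was
# evaluated against one frozen clock: every row has the same timestamp.
distinct = conn.execute("SELECT count(DISTINCT a) FROM t1").fetchone()[0]

before = freelist_count(conn)   # fresh database: nothing freed yet
conn.execute("DROP TABLE t1")
conn.commit()
after = freelist_count(conn)    # the table's pages are now on the freelist

print(distinct, before, after)
```

With the default auto_vacuum=NONE, the file does not shrink on DROP
TABLE; the pages are just queued for reuse, which is exactly the
bookkeeping (and later reuse cost) discussed above.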
The real time is *way* off what I expected, but the usage profile of the
computer was very different by then. The user and sys times were somewhat
higher, but not horribly so (depending on your definition of horrible).

Finally, I ran the rest of the second set of queries for 10 rows:

sqlite> create table t1(a datetime); begin; with
   ...> recursive c(x) as (values(1) union all select x + 1 from c where x <
   ...> 10) insert into t1(a) select datetime('now') from c; commit;
Run Time: real 0.181 user 0.000000 sys 0.000000

In any case, one billion records involves (on my machine) over 28 million
database pages. It takes a certain amount of time to add all those pages
to the freelist (my machine was able to update the freelist at a rate of
about 37,500 pages per second). Dropping a table that large means a lot
of bookkeeping has to be done.

>> Do you see the same behavior for a test of 10M rows (that the second
>> line takes the same amount of time as the first)?

> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

-- 
Scott Robison