Greetings, I'm running into some performance problems with my insert operations using sqlite 3.35. OS is linux with a 2.69 kernel using an out-of-box ext3 file system.
My schema consists of a single table with 12 columns with a numeric affinity and 5 columns with text affinity. The table has a primary integer key on a monotonically increasing row id number. My first experiment proceeds as follows: Using a single thread, I insert 5 million rows containing around 0.5 KB of data to the table. These inserts occur in transactions of 1000 inserts each. In this case, performance is very fast. The entire experiment takes around 10 minutes to run. Typical transaction time is between 0.1 and 0.2 seconds for every 1000 inserts. Next, I add in an index on of my numeric columns. The value of this column is a random 32 bit integer. This is where the problem comes in. At first, the experiment is fast, and I see transaction times similar to what I've observed without the index. Then, after around 800,000 entries, I start seeing drastically larger transaction times of around 2-5 seconds per transaction. After around 2,500,000 entries, these times are consistently hovering around 5 seconds. At that point I killed the process. With the help of I/O stat, it seems that at the point that things get slow we're seeing an I/O wait of around 50%. Towards the end of the experiment, I/O wait is around 85%. Additionally, at the beginning of the experiment, I'm observing a negligible amount of I/O read. When it begins to get slow, I/O occurs at around 1/10th the rate of I/O write. I suspect that this had something to do with the way sqlite caches my index. I modified the code to call sqlite3_enable_shared_cache. It then created a dummy table, inserted a row, and then performed a select on it. In the callback for the select, I then proceeded with the experiment (this is similar to how Mozilla's storage module enables the shared cache). In this case, performance was slightly worse at first (0.4 seconds for each transaction, I suspect that's because it's cheaper to use the buffer cache). As the experiment proceeded, however, the transaction times were consistent until I filled SQLite's cache. Even with an absurdly high cache PRAGMA (~500 MB), it managed to fill after around 1,000,000 rows. At that point, I began to see the dreaded 5 second transaction times again. I tried a few more things (periodically dumping the cache by closing the connections, then "priming" it by doing a select on the value of the index), but I didn't have any luck speeding things up. Anyone care to take a guess what's going on here? Is it a problem with rebalancing the btree that's storing the index? It seems to me that something is doing random I/O, and I was hoping that if I could "force" the index's pages to stay in the cache, then I'd be able to avoid that. Any help is appreciated, -Stan